1 Matrices and linear transformations

As usual, $\mathbf{R}$ and $\mathbf{C}$ denote the real and complex numbers, respectively. If
$z = x + iy$ is a complex number, with $x, y \in \mathbf{R}$, then the complex conjugate of
$z$ is denoted $\overline{z}$ and defined by

(1.1)    $\overline{z} = x - iy.$

Notice that

(1.2)    $\overline{z + w} = \overline{z} + \overline{w}$

and

(1.3)    $\overline{z w} = \overline{z} \, \overline{w}$

for complex numbers $z$, $w$.

If $m$, $n$ are positive integers, we shall denote by $\mathcal{L}(\mathbf{R}^m, \mathbf{R}^n)$ the space of
real-linear mappings from $\mathbf{R}^m$ to $\mathbf{R}^n$, and by $\mathcal{L}(\mathbf{C}^m, \mathbf{C}^n)$ the space of
complex-linear mappings from $\mathbf{C}^m$ to $\mathbf{C}^n$. In the special case where $m = n$, we may
simply write $\mathcal{L}(\mathbf{R}^n)$, $\mathcal{L}(\mathbf{C}^n)$, respectively. Also when $m = n$, we write $I$ for the
identity mapping on $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate.

Using the standard basis for real and complex Euclidean spaces, linear
transformations can be identified with matrices in the usual manner. Let us write
$\mathcal{M}_r(m, n)$ and $\mathcal{M}_c(m, n)$ for the spaces of $m \times n$ real and complex matrices,
respectively. Thus $\mathcal{L}(\mathbf{R}^m, \mathbf{R}^n)$, $\mathcal{L}(\mathbf{C}^m, \mathbf{C}^n)$ can be identified with $\mathcal{M}_r(m, n)$,
$\mathcal{M}_c(m, n)$, respectively, and in particular addition and scalar multiplication of
linear transformations correspond to componentwise addition and scalar
multiplication of matrices.

When $m = n$ we write $\mathcal{M}_r(n)$ and $\mathcal{M}_c(n)$ for the spaces of $n \times n$ real and
complex matrices, respectively. Two elements of $\mathcal{M}_r(n)$ or of $\mathcal{M}_c(n)$ can be
multiplied in the customary manner of "matrix multiplication", which corresponds
exactly to composition of the associated linear transformations on $\mathbf{R}^n$ or $\mathbf{C}^n$.
The matrix associated to the identity transformation $I$ has 1's along the
diagonal and 0's elsewhere, and the product of this matrix with another matrix gives
back that other matrix, just as the composition of the identity transformation
with another transformation gives back that other transformation.

If $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ are elements of $\mathbf{R}^n$, then their inner
product is denoted $\langle x, y \rangle$ and is defined by

(1.4)    $\langle x, y \rangle = \sum_{j=1}^{n} x_j y_j.$

In the complex case, the inner product of two vectors $z = (z_1, \ldots, z_n)$ and
$w = (w_1, \ldots, w_n)$ is also denoted $\langle z, w \rangle$ and is defined by

(1.5)    $\langle z, w \rangle = \sum_{j=1}^{n} z_j \overline{w_j}.$

In both cases the standard Euclidean norm of an element $v$ of $\mathbf{R}^n$ or $\mathbf{C}^n$ is
denoted $|v|$ and is defined to be the nonnegative real number such that

(1.6)    $|v|^2 = \langle v, v \rangle.$

Given a linear transformation $T$ on $\mathbf{R}^n$ or $\mathbf{C}^n$, there is a unique linear
transformation $T^*$ on the same space such that

(1.7)    $\langle T^*(v), w \rangle = \langle v, T(w) \rangle$

for all $v$, $w$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate. This linear transformation $T^*$ is called
the adjoint of $T$. In the real case, the matrix associated to $T^*$ is the transpose
of the matrix associated to $T$, which is to say that the $(j, l)$ component of the
matrix associated to $T^*$ is equal to the $(l, j)$ component of the matrix associated
to $T$, $1 \le j, l \le n$, and in the complex case the matrix associated to $T^*$ can be
obtained by taking the complex conjugates of the entries of the transpose of the
matrix associated to $T$.
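
As a small numerical illustration of (1.7) in the complex case, the following sketch builds the conjugate transpose of a matrix and compares the two sides of the identity. The helper names `inner`, `apply`, and `adjoint`, and the sample matrix and vectors, are illustrative choices and not part of the text.

```python
def inner(z, w):
    """Inner product as in (1.5): sum of z_j times the conjugate of w_j."""
    return sum(a * b.conjugate() for a, b in zip(z, w))

def apply(T, v):
    """Apply the matrix T (a list of rows) to the vector v."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]

def adjoint(T):
    """Conjugate transpose: entry (j, l) is the conjugate of entry (l, j) of T."""
    n = len(T)
    return [[T[l][j].conjugate() for l in range(n)] for j in range(n)]

# Arbitrary sample data (an assumption made for illustration only).
T = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
v = [1 - 1j, 2 + 3j]
w = [0 + 2j, 1 + 1j]

lhs = inner(apply(adjoint(T), v), w)   # <T*(v), w>
rhs = inner(v, apply(T, w))            # <v, T(w)>
print(lhs, rhs)
```

The two printed values should agree up to rounding, which is what (1.7) asserts for this particular choice of $T$, $v$, $w$.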

A linear transformation $T$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be self-adjoint if $T = T^*$,
and the space of self-adjoint linear transformations on $\mathbf{R}^n$, $\mathbf{C}^n$ is denoted $\mathcal{S}(\mathbf{R}^n)$,
$\mathcal{S}(\mathbf{C}^n)$, respectively. The identity transformation is self-adjoint, and if $T_1$, $T_2$
are elements of $\mathcal{S}(\mathbf{R}^n)$ or of $\mathcal{S}(\mathbf{C}^n)$, and if $r_1$, $r_2$ are real numbers, then the
linear combination

(1.8)    $r_1 T_1 + r_2 T_2$

is also an element of $\mathcal{S}(\mathbf{R}^n)$ or $\mathcal{S}(\mathbf{C}^n)$, respectively. Note that it is important
to use real numbers as scalars here even if one is working with linear
transformations on $\mathbf{C}^n$.

Let us write $\mathcal{S}_r(n)$, $\mathcal{S}_c(n)$ for the spaces of real and complex $n \times n$ matrices,
respectively, that correspond to self-adjoint linear transformations. Thus $\mathcal{S}_r(n)$
consists of the matrices in $\mathcal{M}_r(n)$ which are symmetric, in the sense that the
$(j, l)$ and $(l, j)$ entries are equal to each other. Similarly, $\mathcal{S}_c(n)$ consists of the
matrices in $\mathcal{M}_c(n)$ such that the $(l, j)$ entry is equal to the complex conjugate
of the $(j, l)$ entry, and in particular so that the diagonal or $(j, j)$ entries are real
numbers.

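If $x, y \in \mathbf{R}^n$, then

(1.9)    $\langle y, x \rangle = \langle x, y \rangle,$

while if $z, w \in \mathbf{C}^n$, then

(1.10)    $\langle w, z \rangle = \overline{\langle z, w \rangle}.$

As a consequence, if $T$ is a self-adjoint linear transformation on $\mathbf{C}^n$, and $v$ is an
element of $\mathbf{C}^n$, then

(1.11)    $\langle T(v), v \rangle = \langle v, T(v) \rangle = \overline{\langle T(v), v \rangle},$

so that $\langle T(v), v \rangle$ is a real number. Conversely, if $T$ is a linear transformation
on $\mathbf{C}^n$ and $\langle T(v), v \rangle$ is a real number for all $v \in \mathbf{C}^n$, then $T$ is self-adjoint.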

In fact, in the complex case any linear transformation $T$ on $\mathbf{C}^n$ can always be
expressed as $S_1 + i S_2$, where $S_1$, $S_2$ are self-adjoint linear transformations on
$\mathbf{C}^n$. Namely, one can take $S_1 = (T + T^*)/2$ and $S_2 = (T - T^*)/(2i)$. It is easy
to see that $\langle T(v), v \rangle$ is real for all $v \in \mathbf{C}^n$ if and only if $\langle S_2(v), v \rangle = 0$ for all
$v \in \mathbf{C}^n$.
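
The decomposition $T = S_1 + i S_2$ can be checked on a concrete matrix. The sketch below is only an illustration under assumed sample data; the helpers `adjoint` and `combine` are hypothetical names introduced here.

```python
def adjoint(T):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(T)
    return [[T[l][j].conjugate() for l in range(n)] for j in range(n)]

def combine(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Arbitrary sample matrix (an assumption made for illustration only).
T = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
Ts = adjoint(T)

S1 = combine(0.5, T, 0.5, Ts)             # (T + T*) / 2
S2 = combine(1 / 2j, T, -1 / 2j, Ts)      # (T - T*) / (2i)
recombined = combine(1, S1, 1j, S2)       # S1 + i S2

print(S1 == adjoint(S1), S2 == adjoint(S2), recombined == T)
```

Both $S_1$ and $S_2$ come out equal to their own conjugate transposes, and recombining them recovers $T$.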

In both the real and complex cases we have the following fact. Suppose that
$S$ is a self-adjoint linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$ such that

(1.12)    $\langle S(v), v \rangle = 0$

for all $v$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate. Then $S$ is equal to the zero linear
transformation.

More generally, suppose that $S$ is a self-adjoint linear transformation on $\mathbf{R}^n$
or $\mathbf{C}^n$, and that $v$ is an element of $\mathbf{R}^n$ or $\mathbf{C}^n$ such that $|v| = 1$ and

(1.13)    $\langle S(w), w \rangle$

is maximized, or minimized, or has a critical point at $v$, as a function on the
unit sphere, which consists of the vectors $w$ such that $|w| = 1$. As in vector
calculus, one can check that $v$ is an eigenvector for $S$, in the sense that there is
a scalar $\lambda$ such that

(1.14)    $S(v) = \lambda v.$

This scalar $\lambda$ is called the eigenvalue of $S$ associated to the eigenvector $v$, and
for a self-adjoint linear transformation it is easy to verify that the eigenvalues
must be real numbers, even in the complex case.

This is the computation used in a standard proof of the fact that self-adjoint
linear operators on $\mathbf{R}^n$ or $\mathbf{C}^n$ can be diagonalized in an orthonormal basis. In
other words, if $S$ is a self-adjoint linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$, then there
are eigenvectors $v_1, \ldots, v_n$ for $S$ which are orthonormal in the sense that

(1.15)    $\langle v_j, v_l \rangle = 0$

when $j \ne l$ and

(1.16)    $\langle v_j, v_j \rangle = 1$

for each $j$. Let us also mention that if $S$ is a self-adjoint linear transformation
on $\mathbf{R}^n$, $\mathbf{C}^n$ and $v$ is an eigenvector for $S$, and if $w$ is another vector in $\mathbf{R}^n$, $\mathbf{C}^n$
which is orthogonal to $v$ in the sense that

(1.17)    $\langle v, w \rangle = 0,$

then $S(w)$ is also orthogonal to $v$.

A self-adjoint linear transformation $T$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be nonnegative
if

(1.18)    $\langle T(v), v \rangle \ge 0$

for all $v$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate. This is equivalent to the condition that
the eigenvalues of $T$ be nonnegative real numbers. If $T_1$, $T_2$ are nonnegative
self-adjoint linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$ and $r_1$, $r_2$ are nonnegative
real numbers, then

(1.19)    $r_1 T_1 + r_2 T_2$

is also a nonnegative self-adjoint linear transformation.

A linear transformation $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be invertible if there is
another linear transformation $B$ on $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, such that

(1.20)    $A \circ B = B \circ A = I.$

It is easy to check that if $B$ is a mapping on $\mathbf{R}^n$ or $\mathbf{C}^n$ which is the inverse of
$A$ as a mapping, then $B$ must also be linear, so that $A$ is invertible as a linear
mapping. The inverse of a linear transformation $A$ is unique when it exists, and
is denoted $A^{-1}$.

The kernel of a linear transformation $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is the set of vectors
$v$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, such that $A(v) = 0$. The kernel of a linear
transformation is automatically a linear subspace, which means that it contains
the vector 0, the sum of two elements of the kernel again lies in the kernel, and
any scalar multiple of a vector in the kernel is also an element of the kernel.
The kernel of a linear transformation is said to be trivial if it contains only the
vector 0.

If a linear transformation is invertible, then its kernel is trivial. Conversely,
if $A$ is a linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$ whose kernel is trivial, then $A$ is
invertible. This is a well-known fact from linear algebra, and similarly $A$ is
invertible if and only if it maps $\mathbf{R}^n$ or $\mathbf{C}^n$ onto itself, as appropriate.

The statement that the kernel of a linear transformation $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is
nontrivial is equivalent to the statement that $A$ has a nonzero eigenvector with
eigenvalue equal to 0. More generally, a scalar $\lambda$ is an eigenvalue for a linear
transformation $A$ if and only if the linear transformation

(1.21)    $A - \lambda I$

has a nontrivial kernel. For the record, a scalar $\lambda$ is considered to be an
eigenvalue of a linear transformation $A$ only when there is a nonzero eigenvector for
$A$ with eigenvalue $\lambda$.

If $A_1$, $A_2$ are invertible linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$, then the
composition $A_1 \circ A_2$ is also invertible. In this case we have that

(1.22)    $(A_1 \circ A_2)^{-1} = A_2^{-1} \circ A_1^{-1}.$

Conversely, if $A_1$ and $A_2$ are linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$ such that
$A_1 \circ A_2$ is invertible, then $A_1$ and $A_2$ are each invertible themselves, because $A_1$
maps $\mathbf{R}^n$ or $\mathbf{C}^n$ onto itself, as appropriate, and $A_2$ has trivial kernel.

Suppose that $T_1$, $T_2$ are linear operators on $\mathbf{R}^n$ or on $\mathbf{C}^n$. One can check
that

(1.23)    $(T_1 \circ T_2)^* = T_2^* \circ T_1^*.$

If $T$ is an invertible linear operator on $\mathbf{R}^n$ or $\mathbf{C}^n$, then $T^*$ is also invertible,
with

(1.24)    $(T^*)^{-1} = (T^{-1})^*,$

and in particular the inverse of an invertible self-adjoint linear operator is also
self-adjoint.

A self-adjoint linear operator $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be positive-definite
if

(1.25)    $\langle A(v), v \rangle > 0$

for all nonzero vectors $v$. Thus a positive-definite self-adjoint linear operator is
invertible, because it has trivial kernel, and one can check that the inverse is also
positive-definite. Also, a self-adjoint linear transformation is positive-definite if
and only if it is nonnegative and invertible.

Suppose that $T$ is any linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$. Clearly $T^* \circ T$ is
self-adjoint, and it is nonnegative as well, because $\langle T^*(T(v)), v \rangle = |T(v)|^2$.
Moreover, $T^* \circ T$ is positive-definite if and only if $T$ is invertible.

If $A$ is a self-adjoint linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$ which is
positive-definite, and if $a$ is a positive real number, then $a A$ is also a self-adjoint linear
transformation which is positive-definite. If $A_1$, $A_2$ are two self-adjoint linear
transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$ which are nonnegative, and
if at least one of $A_1$, $A_2$ is positive-definite, then the sum $A_1 + A_2$ is a
self-adjoint linear transformation which is positive-definite. In particular, the sum
of two self-adjoint linear transformations which are positive-definite is again
positive-definite.

A linear transformation $T$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be orthogonal or unitary,
respectively, if $T$ is invertible and

(1.26)    $T^{-1} = T^*.$

This is equivalent to saying that

(1.27)    $\langle T(v), T(w) \rangle = \langle v, w \rangle$

for all vectors $v$, $w$ in the domain. In fact, this is equivalent to

(1.28)    $|T(v)| = |v|$

for all vectors $v$ in the domain, as one can show using polarization.

A linear transformation $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be anti-self-adjoint if

(1.29)    $A^* = -A.$

Any linear transformation $T$ can be written as $S + A$, with $S$ a self-adjoint
linear transformation and $A$ an anti-self-adjoint linear transformation, simply
by taking

(1.30)    $S = \frac{T + T^*}{2}, \quad A = \frac{T - T^*}{2}.$

In the complex case a linear transformation is anti-self-adjoint if and only if it
is $i$ times a self-adjoint linear transformation, and in both the real and complex
cases it can be useful to observe that the square of an anti-self-adjoint operator
is self-adjoint, and in fact it is $-1$ times a nonnegative self-adjoint operator.

As above, a subset $L$ of $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be a linear subspace if $0 \in L$,
$v, w \in L$ implies $v + w \in L$, and $v \in L$ implies $a v \in L$ for all scalars $a$, which
is to say all real or complex numbers, as appropriate. The subspace consisting
of only the vector 0 is called the trivial subspace. Of course $\mathbf{R}^n$, $\mathbf{C}^n$ are linear
subspaces of themselves.

Suppose that $v_1, \ldots, v_m$ is a finite collection of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$. The
span of $v_1, \ldots, v_m$ is denoted

(1.31)    $\mathrm{span}(v_1, \ldots, v_m)$

and is the linear subspace consisting of the vectors of the form

(1.32)    $\sum_{j=1}^{m} a_j v_j,$

where $a_1, \ldots, a_m$ are scalars. A linear subspace $L$ of $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be
spanned by a finite collection of vectors $v_1, \ldots, v_m \in L$ if the span of those
vectors is equal to $L$.






A finite collection $v_1, \ldots, v_m$ of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be linearly
independent if a linear combination $\sum_{j=1}^{m} a_j v_j$ of the $v_j$'s is equal to the vector
0 only when the scalars $a_j$ are all equal to 0. This is equivalent to saying that
vectors in the span of $v_1, \ldots, v_m$ are represented in a unique manner as a linear
combination $\sum_{j=1}^{m} a_j v_j$. A finite collection $v_1, \ldots, v_m$ of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$
is said to be a basis for a linear subspace $L$ if $v_1, \ldots, v_m$ are linearly independent
and their span is equal to $L$.

A finite collection $v_1, \ldots, v_m$ of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$ is linearly dependent if
there are scalars $a_1, \ldots, a_m$, with $a_j \ne 0$ for at least one $j$, such that

(1.33)    $\sum_{j=1}^{m} a_j v_j = 0.$

In this case a vector $v_j$ with $a_j \ne 0$ is a linear combination of the others, and
one can discard it to reduce the collection to a smaller one with the same span,
at least if we consider the trivial subspace to be the span of the empty collection
of vectors. Assuming that at least one of the vectors is nonzero, we can repeat
the process to obtain a nonempty subcollection of vectors which is linearly
independent and has the same span.

A basic result from linear algebra states that if $L$ is a linear subspace of
$\mathbf{R}^n$ or $\mathbf{C}^n$ which is spanned by a collection of $m$ vectors, then every linearly
independent collection of vectors in $L$ has at most $m$ elements.
This comes down to the fact that a system of $l$ homogeneous linear equations
with more than $l$ variables always has a nontrivial solution. One can turn this
around and say that if $L$ contains a set of $k$ linearly independent vectors, then
any collection of vectors which spans $L$ has at least $k$ elements.

The standard basis for $\mathbf{R}^n$ or $\mathbf{C}^n$ is the collection of $n$ vectors, each of which
has exactly one component equal to 1 and the others equal to 0. It is easy to
see that this is a basis, which is to say that it is linearly independent and spans
the whole space. Also, every linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$ is spanned by a finite
collection of vectors, and hence has a basis, using the empty collection of vectors
for the trivial subspace.

The dimension of a linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$ is equal to the number
of elements of a basis of the subspace. By the earlier remarks this number is
the same for each basis. The dimension can also be described as the maximum
number of linearly independent vectors in the subspace, or the minimal number
of vectors needed to span the subspace.

Let $L$ be a linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$ with dimension $l$. A collection of $l$
linearly independent vectors in $L$ also spans $L$, since otherwise one could add a
vector in $L$ not in the span of these vectors to get a collection of $l + 1$ linearly
independent vectors in $L$. Similarly, a collection of $l$ vectors in $L$ which spans
$L$ is also linearly independent.

Suppose that $T$ is a linear operator on $\mathbf{R}^n$ or $\mathbf{C}^n$, and that $L$ is a linear
subspace of the same space. In this event $T(L)$, the image of $L$ under $T$, is also
a linear subspace. If $T$ is invertible, then the dimension of $T(L)$ is equal to the
dimension of $L$.

A collection of vectors $v_1, \ldots, v_m$ in $\mathbf{R}^n$ or $\mathbf{C}^n$ is said to be orthonormal if,
as before,

(1.34)    $\langle v_j, v_k \rangle = 0$

when $j \ne k$ and

(1.35)    $|v_j| = 1$

for each $j$. If $v_1, \ldots, v_m$ is an orthonormal collection of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$
and $w$ is in their span, so that

(1.36)    $w = \sum_{j=1}^{m} a_j v_j$

for some scalars $a_j$, then

(1.37)    $a_j = \langle w, v_j \rangle$

for each $j$, and in particular $v_1, \ldots, v_m$ are linearly independent. Also, we have
that

(1.38)    $|w|^2 = \sum_{j=1}^{m} |\langle w, v_j \rangle|^2$

in this case.

Let us recall the Cauchy-Schwarz inequality, which states that if $v$, $w$ are
elements of $\mathbf{R}^n$ or of $\mathbf{C}^n$, then

(1.39)    $|\langle v, w \rangle| \le |v| \, |w|.$

This can be shown using the fact that

(1.40)    $\langle v + a w, v + a w \rangle = |v + a w|^2 \ge 0$

for all scalars $a$. Using this inequality, one can also show that

(1.41)    $|v + w| \le |v| + |w|,$

which is to say the triangle inequality.

As before, if $v$, $w$ are two vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$, then we say that $v$, $w$ are
orthogonal if

(1.42)    $\langle v, w \rangle = 0,$

and in this case we write $v \perp w$. If $v$, $w$ are orthogonal vectors in $\mathbf{R}^n$ or in $\mathbf{C}^n$,
then

(1.43)    $|v + w|^2 = |v|^2 + |w|^2,$

and $\alpha v \perp \beta w$ for all scalars $\alpha$, $\beta$. Conversely, notice that if $v$, $w$ are two vectors
in $\mathbf{R}^n$ such that $|v + w|^2 = |v|^2 + |w|^2$, then $v \perp w$, and if $v$, $w$ are two vectors
in $\mathbf{C}^n$ such that $|v + \alpha w|^2 = |v|^2 + |w|^2$ for all complex numbers $\alpha$ with $|\alpha| = 1$,
then $v \perp w$.

Suppose again that $v_1, \ldots, v_m$ is an orthonormal collection of vectors in $\mathbf{R}^n$
or $\mathbf{C}^n$. If $u$ is any vector in $\mathbf{R}^n$ or $\mathbf{C}^n$, then

(1.44)    $u' = \sum_{j=1}^{m} \langle u, v_j \rangle \, v_j$

lies in the span of $v_1, \ldots, v_m$, and one can check that $u - u'$ is orthogonal to
every vector in the linear span of $v_1, \ldots, v_m$. In particular,

(1.45)    $\langle u - u', u' \rangle = 0,$

and

(1.46)    $|u|^2 = |u - u'|^2 + |u'|^2.$
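
To make (1.44) through (1.46) concrete, here is a minimal sketch in the real case. The function name `project` and the sample vectors are assumptions chosen for illustration; the two basis vectors were picked by hand to be orthonormal.

```python
def inner(x, y):
    """Real inner product as in (1.4)."""
    return sum(a * b for a, b in zip(x, y))

def project(u, basis):
    """Return u' = sum_j <u, v_j> v_j for an orthonormal list `basis`."""
    out = [0.0] * len(u)
    for v in basis:
        c = inner(u, v)
        out = [o + c * vi for o, vi in zip(out, v)]
    return out

# Sample orthonormal pair in R^3 and a sample vector (illustrative assumptions).
basis = [[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
u = [1.0, 2.0, 3.0]

u_prime = project(u, basis)
residual = [a - b for a, b in zip(u, u_prime)]     # u - u'

print(inner(residual, u_prime))                                    # close to 0, as in (1.45)
print(inner(u, u), inner(residual, residual) + inner(u_prime, u_prime))  # both close to 14, as in (1.46)
```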

If $T$ is a linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$, then the trace of $T$ is denoted
$\mathrm{tr}\, T$ and is defined to be the sum of the diagonal terms in the standard matrix
associated to $T$. To be more explicit, let $e_1, \ldots, e_n$ denote the standard basis
for $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, so that $e_j$ has $j$th component equal to 1 and all
other components equal to 0. The trace of a linear transformation $T$ can then
be expressed as

(1.47)    $\mathrm{tr}\, T = \sum_{j=1}^{n} \langle T(e_j), e_j \rangle.$

The trace is clearly linear in $T$, so that if $T_1$, $T_2$ are linear transformations
on $\mathbf{R}^n$ or on $\mathbf{C}^n$ and $a_1$, $a_2$ are scalars, then

(1.48)    $\mathrm{tr}(a_1 T_1 + a_2 T_2) = a_1 \, \mathrm{tr}\, T_1 + a_2 \, \mathrm{tr}\, T_2.$

Another fundamental property of the trace is that

(1.49)    $\mathrm{tr}(T_1 \circ T_2) = \mathrm{tr}(T_2 \circ T_1)$

for all linear transformations $T_1$, $T_2$. This can be verified in a straightforward
manner.

If $T$ is a linear transformation on $\mathbf{R}^n$, then

(1.50)    $\mathrm{tr}\, T^* = \mathrm{tr}\, T,$

while if $T$ is a linear transformation on $\mathbf{C}^n$, then

(1.51)    $\mathrm{tr}\, T^* = \overline{\mathrm{tr}\, T}.$

If $A$, $B$ are linear transformations on $\mathbf{R}^n$ or $\mathbf{C}^n$, then

(1.52)    $\mathrm{tr}(B^* \circ A) = \sum_{j=1}^{n} \langle B^*(A(e_j)), e_j \rangle = \sum_{j=1}^{n} \langle A(e_j), B(e_j) \rangle,$

where as usual $e_1, \ldots, e_n$ denotes the standard basis for $\mathbf{R}^n$ or $\mathbf{C}^n$. This is the
same as

(1.53)    $\sum_{j=1}^{n} \sum_{l=1}^{n} \langle A(e_j), e_l \rangle \, \langle B(e_j), e_l \rangle$

in the real case and

(1.54)    $\sum_{j=1}^{n} \sum_{l=1}^{n} \langle A(e_j), e_l \rangle \, \overline{\langle B(e_j), e_l \rangle}$

in the complex case, which is to say that one takes the standard matrices of $A$,
$B$, views them as elements of $\mathbf{R}^{n^2}$ or $\mathbf{C}^{n^2}$, as appropriate, and then takes the
usual inner product.

In particular, if $T$ is a linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$, let $\|T\|_{HS}$ be the
nonnegative real number defined by

(1.55)    $\|T\|_{HS}^2 = \mathrm{tr}(T^* \circ T) = \sum_{j=1}^{n} \sum_{k=1}^{n} |\langle T(e_j), e_k \rangle|^2.$

In other words, $\|T\|_{HS}$ is the same as the usual Euclidean norm of the standard
matrix associated to $T$, and it is also known as the Hilbert-Schmidt norm of $T$.
Observe that $\|T\|_{HS} = 0$ if and only if $T = 0$, $\|a T\|_{HS} = |a| \, \|T\|_{HS}$ for all
scalars $a$ and all linear transformations $T$, $\|T_1 + T_2\|_{HS} \le \|T_1\|_{HS} + \|T_2\|_{HS}$ for
all linear transformations $T_1$, $T_2$, and that

(1.56)    $|\mathrm{tr}(T_2^* \circ T_1)| \le \|T_1\|_{HS} \, \|T_2\|_{HS}$

for all linear transformations $T_1$, $T_2$.
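
The two descriptions of the Hilbert-Schmidt norm in (1.55), and the inequality (1.56), can be compared numerically. This is a sketch under assumed sample matrices; the helper names `compose`, `trace`, `hs_norm`, and `entrywise_norm` are hypothetical.

```python
import math

def adjoint(T):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(T)
    return [[complex(T[l][j]).conjugate() for l in range(n)] for j in range(n)]

def compose(A, B):
    """Matrix product, corresponding to the composition A o B."""
    n = len(A)
    return [[sum(A[j][k] * B[k][l] for k in range(n)) for l in range(n)] for j in range(n)]

def trace(T):
    return sum(T[j][j] for j in range(len(T)))

def hs_norm(T):
    """Square root of tr(T* o T); the trace is real up to rounding."""
    return math.sqrt(trace(compose(adjoint(T), T)).real)

def entrywise_norm(T):
    """Euclidean norm of the matrix entries."""
    return math.sqrt(sum(abs(x) ** 2 for row in T for x in row))

# Sample matrices (illustrative assumptions).
T1 = [[1 + 1j, 2], [0, 1j]]
T2 = [[3, 0], [1 - 1j, 2]]

pairing = trace(compose(adjoint(T2), T1))      # tr(T2* o T1)
print(hs_norm(T1), entrywise_norm(T1))         # both close to sqrt(7)
print(abs(pairing), hs_norm(T1) * hs_norm(T2)) # left value does not exceed the right
```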

If $T$ is a linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$, then the operator norm of $T$
is denoted $\|T\|_{op}$ and defined to be the maximum of $|T(v)|$ over all vectors $v$ in
the domain with $|v| = 1$, which exists by the extreme value theorem in calculus.
In other words,

(1.57)    $|T(w)| \le \|T\|_{op} \, |w|$

for all vectors $w$, and $\|T\|_{op}$ is the smallest nonnegative real number with this
property. One can check that $\|T\|_{op} = 0$ if and only if $T = 0$, $\|a T\|_{op} =
|a| \, \|T\|_{op}$ for all scalars $a$ and all linear transformations $T$, $\|T_1 + T_2\|_{op} \le
\|T_1\|_{op} + \|T_2\|_{op}$ for all linear transformations $T_1$, $T_2$, and

(1.58)    $\|T_1 \circ T_2\|_{op} \le \|T_1\|_{op} \, \|T_2\|_{op}$

for all linear transformations $T_1$, $T_2$.

Alternatively, $\|T\|_{op}$ can be described as the maximum of $|\langle T(v), w \rangle|$ over all
vectors $v$, $w$ in the domain such that $|v| = |w| = 1$, which is the same as saying
that

(1.59)    $|\langle T(v), w \rangle| \le \|T\|_{op} \, |v| \, |w|$

for all vectors $v$, $w$ in the domain, and that $\|T\|_{op}$ is the smallest nonnegative
real number with this property. In particular, it follows that

(1.60)    $\|T^*\|_{op} = \|T\|_{op}$

for all linear transformations $T$. It is easy to check as well that

(1.61)    $\|T^*\|_{HS} = \|T\|_{HS}$

for all linear transformations $T$.
Notice that

(1.62)    $|\langle T(e_j), e_k \rangle| \le \|T\|_{op},$

so that the operator norm of $T$ is greater than or equal to the absolute values
of the entries of the standard matrix associated to $T$. One can express $\|T\|_{HS}$
by

(1.63)    $\|T\|_{HS}^2 = \sum_{j=1}^{n} |T(e_j)|^2,$

from which it follows that $\|T\|_{HS} \le n^{1/2} \, \|T\|_{op}$. From this formula it also
follows that $\|A \circ B\|_{HS} \le \|A\|_{op} \, \|B\|_{HS}$, and similarly one has $\|A \circ B\|_{HS} \le
\|A\|_{HS} \, \|B\|_{op}$ for all linear transformations $A$, $B$.
Suppose that $v_1, \ldots, v_m$ is an orthonormal collection of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$,
and let $L$ denote the span of this collection. As we have seen, if $u$ is any vector
in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, then there is a vector $u' \in L$ such that $u - u'$ is
orthogonal to every element of $L$. These two properties characterize $u'$, since
if $u'' \in L$ and $u - u''$ is orthogonal to every element of $L$, then $u' - u''$ is an
element of $L$ and is orthogonal to every element of $L$, including itself, so that
$u' - u'' = 0$.

In this situation let us write $P_L$ for the linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$,
as appropriate, which sends $u$ to $u'$. This is called the orthogonal projection of
$\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, onto $L$. It is uniquely determined by $L$, which is to
say that it does not depend on the choice of orthonormal basis for $L$.

Using these orthogonal projections, one can show that every orthonormal set
of vectors in $\mathbf{R}^n$ or $\mathbf{C}^n$ can be extended to an orthonormal basis, and that every
linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$ has an orthonormal basis. This is basically the same
as the Gram-Schmidt process, in which a collection of vectors is orthonormalized
one step at a time. In particular, for every linear subspace $L$ of $\mathbf{R}^n$ or $\mathbf{C}^n$
there is a corresponding orthogonal projection $P_L$, which one can also check is
self-adjoint.
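
The step-at-a-time orthonormalization mentioned above can be sketched as follows in the real case. This is the classical form of the Gram-Schmidt process written for illustration only; the function name `gram_schmidt`, the tolerance `tol`, and the sample vectors are assumptions, and the tolerance is a practical device for floating-point arithmetic rather than part of the mathematical statement.

```python
import math

def inner(x, y):
    """Real inner product as in (1.4)."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize real vectors one at a time; (nearly) dependent vectors are skipped."""
    basis = []
    for u in vectors:
        # Subtract from u its orthogonal projection onto the span of `basis`.
        r = list(u)
        for v in basis:
            c = inner(u, v)
            r = [ri - c * vi for ri, vi in zip(r, v)]
        norm = math.sqrt(inner(r, r))
        if norm > tol:
            basis.append([ri / norm for ri in r])
    return basis

# The third sample vector is the sum of the first two, so only two vectors survive.
basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [2.0, 1.0, 1.0]])
print(len(basis))
```

The resulting list is an orthonormal basis for the span of the input vectors, so its length is the dimension of that span.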

Let $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ be orthonormal bases of $\mathbf{R}^n$ or of $\mathbf{C}^n$,
respectively. If $T$ is a linear transformation on $\mathbf{R}^n$ or on $\mathbf{C}^n$, as appropriate, then one
can check that

(1.64)    $\|T\|_{HS}^2 = \sum_{j=1}^{n} |T(v_j)|^2$

and that

(1.65)    $\|T\|_{HS}^2 = \sum_{j=1}^{n} \sum_{k=1}^{n} |\langle T(v_j), w_k \rangle|^2.$

In particular, it follows that $\|T\|_{op} \le \|T\|_{HS}$.

Suppose that $L$ is a nontrivial linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$, and that $P_L$
is the corresponding orthogonal projection onto $L$. For each vector $u$ in the
domain, we have that $P_L(u)$ and $u - P_L(u)$ are orthogonal to each other, so
that

(1.66)    $|P_L(u)|^2 + |u - P_L(u)|^2 = |u|^2,$

and one can check that $\|P_L\|_{op} = 1$. From the remarks in the previous
paragraphs it follows that $\|P_L\|_{HS}$ is equal to the square root of the dimension of
$L$.

In general, a projection on $\mathbf{R}^n$ or $\mathbf{C}^n$ is a linear operator $P$ which is an
"idempotent", which means that

(1.67)    $P^2 = P.$

Thus for instance the identity and the operator 0 are projections, and in general
if $P$ is a projection and $L$ is the image of $P$, so that $L$ consists of the vectors of
the form $P(v)$ for vectors $v$ in the domain, then $L$ is exactly the set of vectors
$w$ such that $P(w) = w$. If $P$ is a projection and $v$ is any vector in the domain,
then $P(v)$ lies in the image of $P$ and $v - P(v)$ lies in the kernel of $P$.

If $L$ is a linear subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$, then the orthogonal complement of $L$
is denoted $L^\perp$ and defined to be the linear subspace of vectors $v$ such that $v$ is
orthogonal to $w$ for all $w \in L$. From the earlier remarks it follows that every
vector $u$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, can be written in a unique way as the
sum of vectors in $L$ and $L^\perp$. One can also check that $(L^\perp)^\perp = L$.

A projection $P$ on $\mathbf{R}^n$ or $\mathbf{C}^n$ with image $L$ is equal to the orthogonal
projection onto $L$ if and only if the kernel of $P$ is equal to $L^\perp$. Also, a projection $P$
is an orthogonal projection if and only if $P$ is self-adjoint. The operator norm
of a nonzero projection is automatically greater than or equal to 1, and one
can check that it is equal to 1 if and only if the projection is an orthogonal
projection.
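
For contrast with the orthogonal case, here is a small sketch of a projection on $\mathbf{R}^2$ that is idempotent but not self-adjoint, together with a unit vector whose image has norm larger than 1. The matrix `P` and vector `v` are sample choices made for illustration.

```python
import math

def apply(P, v):
    """Apply the matrix P (a list of rows) to the vector v."""
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

# Image: multiples of (1, 0). Kernel: multiples of (1, -1), which is not
# orthogonal to the image, so this is not an orthogonal projection.
P = [[1.0, 1.0], [0.0, 0.0]]
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # a unit vector

Pv = apply(P, v)
PPv = apply(P, Pv)
print(Pv == PPv)      # applying P a second time changes nothing
print(norm(Pv))       # about 1.414, so the operator norm of P exceeds 1
```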

Now let us briefly review some aspects of determinants. We begin with 
some facts about permutations. Fix a positive integer n, and let Sym(n) de- 
note the symmetric group on {1, . . . ,n} consisting of the permutations on the 
set {1, . . . ,n} of positive integers from 1 to n, which is to say the one-to-one 
mappings from this set onto itself, with composition of mappings as the group
operation, and inverses of mappings as inverses in the group.

A transposition is a permutation $\tau$ on $\{1, \ldots, n\}$ which interchanges two
elements of the set and leaves the others fixed. A basic fact is that every element
of the symmetric group can be expressed as a composition of finitely many
transpositions. Of course such a product is not unique, and another important
result is that the parity of the number of transpositions used is unique, i.e., it
depends only on the original permutation.

In effect this is the same as saying that the identity permutation, which fixes
all elements of the set, can be expressed as a composition of an even number
of transpositions, and not an odd number of transpositions. An element of
the symmetric group is said to be even or odd according to whether it can
be expressed as the composition of an even or odd number of transpositions.
The composition of two even permutations is even, the composition of two odd
permutations is an even permutation, the composition of an even and an odd
permutation is an odd permutation, and the inverse of a permutation $\pi$ has the
same type as $\pi$ does.
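
The parity rules above can be checked by brute force for small $n$. The sketch below computes the sign by counting inversions, which is a standard equivalent way of determining parity rather than the transposition count used in the text; it also labels the set as $0, \ldots, n-1$ instead of $1, \ldots, n$. The names `sign` and `compose` are illustrative.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one, via the inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def compose(p, q):
    """The composition p o q, sending i to p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

# Check that sign(p o q) = sign(p) * sign(q) for all permutations of a 3-element set.
all_perms = list(permutations(range(3)))
multiplicative = all(sign(compose(p, q)) == sign(p) * sign(q)
                     for p in all_perms for q in all_perms)
print(multiplicative)
```

In particular, two odd permutations compose to an even one, since $(-1)(-1) = +1$.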

Now let $A$ be a linear transformation on $\mathbf{R}^n$ or $\mathbf{C}^n$, and let $(a_{j,l})$ denote the
corresponding $n \times n$ matrix of real or complex numbers. The determinant of $A$
is denoted $\det A$ and is the real or complex number, respectively, given by

(1.68)    $\det A = \sum_{\pi \in \mathrm{Sym}(n)} \mathrm{sign}(\pi) \, a_{1,\pi(1)} \, a_{2,\pi(2)} \cdots a_{n,\pi(n)},$

where $\mathrm{sign}(\pi)$ is equal to $+1$ or $-1$ according to whether the permutation $\pi$ is
even or odd. Thus the determinant of $A$ is a homogeneous polynomial of degree
$n$ as a function of the entries of the matrix $(a_{j,l})$.
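
Formula (1.68) translates directly into code, although the sum has $n!$ terms and so is only practical for small $n$. The following is a sketch for illustration; indices run over $0, \ldots, n-1$, the sign is computed from the inversion count (an equivalent description of parity), and the sample matrices are assumptions.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one, via the inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(a):
    """Determinant as the sum over permutations of signed products of entries."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for j in range(n):
            term *= a[j][perm[j]]
        total += term
    return total

print(det([[1, 2], [3, 4]]))                     # -2
print(det([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))    # 18
```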

When $n = 1$, the matrix associated to $A$ is really just a single number,
and the determinant of $A$ is that number. In general we have that $\det I = 1$,
$\det A^* = \det A$ in the real case and $\det A^* = \overline{\det A}$ in the complex case, and

(1.69)    $\det(A \circ B) = (\det A) \, (\det B)$

for all linear transformations $A$, $B$ on $\mathbf{R}^n$ or on $\mathbf{C}^n$. It follows from this that if
$A$ is an invertible linear transformation, then $\det A \ne 0$ and indeed

(1.70)    $(\det A)^{-1} = \det(A^{-1}),$

and conversely there is the well-known Cramer's rule, which states that a linear
transformation with nonzero determinant is invertible, with a formula for the
inverse in terms of the determinant of the linear transformation and determinants
of submatrices of the associated matrix.

Let $v_1, \ldots, v_n$ be a basis for $\mathbf{R}^n$ or $\mathbf{C}^n$, and let $A$ be a linear transformation
on $\mathbf{R}^n$ or $\mathbf{C}^n$. It is easy to see that $A$ is uniquely determined by its values on
$v_1, \ldots, v_n$, and conversely that if $w_1, \ldots, w_n$ is any other collection of $n$ vectors
in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, then there is a linear transformation $A$ such that
$A(v_j) = w_j$ for each $j$. Also, $A$ is an invertible linear transformation if and only
if $A(v_1), \ldots, A(v_n)$ is a basis too.

For any choice of basis for $\mathbf{R}^n$ or $\mathbf{C}^n$, there is a natural correspondence
between linear transformations on $\mathbf{R}^n$ or $\mathbf{C}^n$ and matrices with real or complex
entries, respectively, in such a way that the diagonal matrices correspond exactly
to linear transformations for which the vectors in the basis are eigenvectors. Of
course for any two choices of bases there is an invertible linear transformation
which sends one basis to the other. For a single linear transformation, one gets
two matrices associated to the two bases, and these two matrices are related by
conjugation.

In particular, for a linear transformation $A$ on $\mathbf{R}^n$ or on $\mathbf{C}^n$ and a choice of
basis $v_1, \ldots, v_n$, one gets a matrix associated to this linear transformation and
basis, and one can take the trace or determinant of this matrix. It follows from
the basic identities for the trace and determinant that the trace and the determinant
of this matrix are the same as for the matrix associated to any other choice of basis.
As a special case, if the linear transformation is diagonalizable, in the sense that
there is a basis of eigenvectors, then the trace is the same as the sum of the
corresponding $n$ eigenvalues, and the determinant is equal to the product of the
eigenvalues.

As another basic example, if $P$ is a projection on $\mathbf{R}^n$ or on $\mathbf{C}^n$ whose image
is a linear subspace $L$ of dimension $l$, then the trace of $P$ is equal to $l$.

Now let us look at exponentials, beginning with exponentiation of real
numbers. The exponential function is denoted $\exp(x)$ and can be defined by the
series expansion

(1.71)    $\exp(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}.$

Here $x^n$ is interpreted as being equal to 1 when $n = 0$, even if $x = 0$, and $n!$
is "$n$ factorial", the product of the positive integers from 1 to $n$, which is also
interpreted as being equal to 1 when $n = 0$.

By standard results, this series converges for all $x \in \mathbf{R}$, and converges
absolutely, and it also converges uniformly on bounded subsets of $\mathbf{R}$. The sum
defines a real-valued function on the real line which is continuous and has
continuous derivatives of all orders, with the derivatives being given by the series
obtained by differentiating this one term by term. In this case we have the
well-known identity

(1.72)    $\exp'(x) = \exp(x),$

i.e., the derivative of the exponential function is itself.
A related identity is

(1.73)    $\exp(x + y) = \exp(x) \, \exp(y).$

Formally this can be derived by multiplying the series for $\exp(x)$ and $\exp(y)$,
grouping terms of total degree $n$, and using the binomial theorem to identify them
with the terms of $\exp(x + y)$. Convergence issues can be handled using absolute
convergence of the series involved, by standard arguments.

Clearly $\exp(x) \ge 1$ when $x \ge 0$. From the multiplicative identity it follows
that $\exp(x) \ne 0$ for all $x \in \mathbf{R}$, and in fact that

(1.74)    $\exp(-x) = \frac{1}{\exp(x)}.$

It follows that $\exp(x) > 0$ for all $x \in \mathbf{R}$, and hence the derivatives of $\exp(x)$ are
all positive as well, so that $\exp(x)$ is strictly increasing and strictly convex in
particular.
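
The series (1.71) and the identities (1.73), (1.74) can be explored with partial sums. The sketch below truncates the series after a fixed number of terms; the function name `exp_series` and the default of 40 terms are arbitrary choices, adequate for the small arguments used here but not a general-purpose implementation.

```python
import math

def exp_series(x, terms=40):
    """Partial sum of the series sum_{n >= 0} x^n / n!."""
    total, term = 0.0, 1.0          # `term` holds x^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term *= x / (n + 1)         # pass from x^n / n! to x^(n+1) / (n+1)!
    return total

x, y = 0.7, 1.1
print(exp_series(1.0), math.e)                           # close to each other
print(exp_series(x + y), exp_series(x) * exp_series(y))  # close, as in (1.73)
print(exp_series(-2.0) * exp_series(2.0))                # close to 1, as in (1.74)
```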

Next we consider complex numbers. That is, we define $\exp(z)$ for $z \in \mathbf{C}$ by
the same series as before, namely,

(1.75)    $\exp(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$

This series converges absolutely for all $z \in \mathbf{C}$, it converges uniformly on bounded
subsets of $\mathbf{C}$, and it is continuously differentiable of all orders.
Again we have the identities

(1.76)    $\exp'(z) = \exp(z)$

and

(1.77)    $\exp(z + w) = \exp(z) \, \exp(w)$
for all $z, w \in \mathbf{C}$. The meaning of the differential equation for $\exp(z)$ is that
$\exp(z)$ is a holomorphic function of $z$ whose complex derivative is equal to
$\exp(z)$. To put it another way, the differential of $\exp(z)$ at a point $z$ is given by
multiplication by $\exp(z)$, so that

(1.78)    $\exp(z + h) = \exp(z) + \exp(z) \cdot h + O(|h|^2).$

If $z = x + iy$, with $x, y \in \mathbf{R}$, then

(1.79)    $\exp(z) = \exp(x) \, (\cos(y) + i \sin(y)).$

This is a well-known and striking formula, which can be seen by writing out
the series expansions for the real and imaginary parts of $\exp(iy)$ and comparing
them with the usual series expansions for the cosine and sine. Also, as a
complex-valued function of a real variable, we have that

(1.80)    $\frac{d}{dy} \exp(iy) = i \exp(iy)$

and hence

(1.81)    $\frac{d^2}{dy^2} \exp(iy) = -\exp(iy),$

which correspond to standard formulas for the derivatives of the cosine and sine,
including the second-order differential equations that they satisfy.
It is clear from the series expansion that

(1.82) $\overline{\exp(z)} = \exp(\overline{z})$

for all $z \in C$. In particular, if z = x + iy with $x, y \in R$, then

(1.83) $|\exp(z)| = \exp(x).$

The special case

(1.84) $|\exp(iy)| = 1$

for all $y \in R$ corresponds to the usual identity $\cos(y)^2 + \sin(y)^2 = 1$.

Fix a positive integer n, and suppose that A is a linear transformation on
R^n or on C^n. We would like to define exp(A) by the series

(1.85) $\exp(A) = \sum_{k=0}^{\infty} \frac{A^k}{k!},$

where now A^k denotes the k-fold composition of A as a linear transformation,
interpreted as being the identity operator I when k = 0. The convergence of
this series can be defined in terms of the convergence of the entries of the
corresponding matrices, and as before we have absolute convergence for all
linear transformations A, uniform convergence on bounded sets of such linear
transformations, and that the exponential defines a continuous function from
linear transformations to themselves which is continuously differentiable of
all orders.

A convenient way to look at absolute convergence of series of linear
transformations is in terms of convergence of the corresponding series of
operator norms. In this case we have such convergence, because

(1.86) $\sum_{k=0}^{\infty} \frac{\|A^k\|_{op}}{k!} \le \sum_{k=0}^{\infty} \frac{\|A\|_{op}^k}{k!}.$

In particular, note that

(1.87) $\|\exp(A)\|_{op} \le \exp(\|A\|_{op}),$

where the right side refers to the exponential of the operator norm of A as a
real number.

If A and B are linear transformations on R^n or on C^n which commute, then
we still have that

(1.88) $\exp(A + B) = \exp(A) \circ \exp(B),$

for essentially the same reasons as before. Of course

(1.89) $\exp(0) = I,$

and for any linear transformation A we have that exp(A) is invertible, with

(1.90) $(\exp(A))^{-1} = \exp(-A)$

and thus

(1.91) $\|(\exp(A))^{-1}\|_{op} \le \exp(\|A\|_{op}).$

Furthermore,

(1.92) $(\exp(A))^* = \exp(A^*),$

which is to say that the adjoint of the exponential of A is equal to the
exponential of the adjoint of A.
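To see why the commutativity hypothesis in (1.88) cannot simply be dropped, here is a small worked example on R^2. The matrices N and M are chosen only for illustration and do not appear elsewhere in these notes.

```latex
% Illustrative 2x2 example (N, M are hypothetical names): N M \ne M N.
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad
M = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad
N^2 = M^2 = 0 .
% The series (1.85) terminate after the linear term:
\exp(N) = I + N = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \quad
\exp(M) = I + M = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \quad
\exp(N) \exp(M) = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}.
% On the other hand (N + M)^2 = I, so even and odd powers separate:
\exp(N + M) = \cosh(1)\, I + \sinh(1)\, (N + M)
            = \begin{pmatrix} \cosh 1 & \sinh 1 \\ \sinh 1 & \cosh 1 \end{pmatrix}.
```

Since cosh(1) is approximately 1.543 rather than 2, the two sides of (1.88) differ for this non-commuting pair.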

If A is a linear transformation on R^n or on C^n, and if T is another linear
transformation on R^n or on C^n, as appropriate, which commutes with A, then
T also commutes with exp(A), and the directional derivative of the exponential
at A in the direction of T is given by multiplication by exp(A), so that

(1.93) $\exp(A + T) = \exp(A) + \exp(A) T + O(\|T\|_{op}^2).$

In particular, if A is a linear transformation on R^n or on C^n and we put

(1.94) $E_A(t) = \exp(t A),$

viewed as a function on the real line with values in linear transformations on
R^n or on C^n, as appropriate, then this function is continuously differentiable
of all orders and satisfies

(1.95) $\frac{d}{dt} E_A(t) = A \circ E_A(t), \quad E_A(0) = I.$






These conditions characterize E_A(t) uniquely, by standard results about
ordinary differential equations.
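As a concrete instance of this characterization, one can work out the exponential of a simple 2 x 2 matrix and check the differential equation directly. The matrix J below is an illustrative choice, not part of the original discussion.

```latex
% Illustrative generator of rotations on R^2 (J is a hypothetical name): J^2 = -I.
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad J^2 = -I .
% Splitting the series (1.85) into even and odd powers of tJ:
E_J(t) = \exp(tJ) = \cos(t)\, I + \sin(t)\, J
       = \begin{pmatrix} \cos t & -\sin t \\ \sin t & \cos t \end{pmatrix}.
% Check of the initial value problem:
\frac{d}{dt} E_J(t) = -\sin(t)\, I + \cos(t)\, J
                    = J \bigl( \cos(t)\, I + \sin(t)\, J \bigr) = J \circ E_J(t),
\qquad E_J(0) = I .
```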

What about the determinant of the exponential of a linear transformation?
Notice first that the differential of the determinant as a function on linear
transformations on R^n or on C^n and evaluated at the identity transformation
is given by the trace. That is, if T is any linear transformation on R^n or on
C^n, then

(1.96) $\det(I + T) = 1 + \operatorname{tr} T + O(\|T\|_{op}^2),$

and of course this is just a simple algebraic statement, since the determinant of
a linear transformation A is a polynomial in the entries of the matrix associated
to A.

This implies that

(1.97) $\frac{d}{dt} \det(\exp(t A)) = (\operatorname{tr} A) \det(\exp(t A)).$

More precisely, at t = 0 this follows exactly from the remarks of the preceding
paragraph. In general, for any two real numbers r, s, we have that

(1.98) $\exp((r + s) A) = \exp(r A) \circ \exp(s A),$

and this permits one to derive the formula for the derivative at any real number
t from the special case of t = 0.

Of course the determinant of exp(tA) at t = 0 is equal to 1, and it follows
that

(1.99) $\det(\exp(t A)) = \exp(t \operatorname{tr} A).$

The trace of A is a real or complex number, and the right side is the usual
exponential of a scalar. We may as well apply this to t = 1 and say that

(1.100) $\det(\exp(A)) = \exp(\operatorname{tr} A).$
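The passage from t = 0 to general t can be written out as follows. This is only a sketch of the argument indicated above; the function name f and the diagonal example with entries a, b are introduced here for illustration.

```latex
% Step 1: multiplicativity of det together with (1.98).
\det(\exp((t + s) A)) = \det(\exp(t A)) \, \det(\exp(s A)).
% Step 2: differentiate in s at s = 0, using \exp(sA) = I + sA + O(s^2) and (1.96).
\frac{d}{dt} \det(\exp(t A))
  = \det(\exp(t A)) \cdot \Bigl( \frac{d}{ds} \det(\exp(s A)) \Bigr)_{s=0}
  = (\operatorname{tr} A) \, \det(\exp(t A)).
% Step 3: f(t) = \det(\exp(tA)) solves f' = (\operatorname{tr} A) f with f(0) = 1, so
f(t) = \exp(t \operatorname{tr} A).
% Sanity check on a diagonal matrix (a, b are arbitrary scalars):
A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}
\;\Longrightarrow\;
\det(\exp(A)) = e^{a} e^{b} = e^{a + b} = \exp(\operatorname{tr} A).
```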

Suppose that A is a linear transformation on R^n or on C^n and that v is a
vector in R^n or in C^n, as appropriate. Set

(1.101) $h(t) = \exp(t A)(v),$

viewed as a function from the real line into R^n or C^n, i.e., where for each t we
let h(t) be the image of v under exp(tA). As before we have that h'(t) = A(h(t))
and that h(0) = v, and h(t) is characterized by these properties by standard
results about ordinary differential equations.

Assume further that v is an eigenvector for A with eigenvalue $\lambda$, so that

(1.102) $A(v) = \lambda v,$

where $\lambda$ is a scalar. In this case

(1.103) $\exp(t A)(v) = \exp(t \lambda) v,$

where $\exp(t \lambda)$ is the usual exponential mapping for scalars. One can see
this either from the series expansion for the exponential or from the
characterization in terms of ordinary differential equations.

It may be that A is diagonalizable, so that there is a basis of eigenvectors
for A. The elements of this basis are then eigenvectors for exp(tA) too,
with the eigenvalues for the exponential being given by the exponentials of the
corresponding eigenvalues, as in the previous paragraph. In other words, the
exponential is then also diagonalizable, and by the same basis as for A itself.

For that matter, suppose that A is a linear transformation on R^n or on C^n,
and that L is a linear subspace of the same space which is invariant under A.
This means that

(1.104) $A(L) \subseteq L,$

which is to say that $A(v) \in L$ for all $v \in L$. In this event L is invariant
under exp(tA) for all t as well, as one can see from either the series expansion
for the exponential or the characterization in terms of ordinary differential
equations.

Next we review some aspects of spectral theory of matrices. If A is a linear
transformation on C^n, then the characteristic polynomial associated to A is
defined by

(1.105) $q_A(z) = \det(z I - A).$

Thus q_A(z) is a polynomial of degree n whose leading coefficient is equal to 1
and which vanishes exactly at the eigenvalues of A.

The fundamental theorem of algebra states that every nonconstant polynomial
on the complex numbers has a root. As a result, every linear transformation
on C^n has at least one eigenvalue. Recall as well that every nonconstant
polynomial on the complex numbers can be factored as a nonzero complex number
times a product of linear factors of the form (z - a), $a \in C$.

If p(z) is a polynomial,

(1.106) $p(z) = c_m z^m + c_{m-1} z^{m-1} + \cdots + c_0,$

$c_0, \ldots, c_m \in C$, and A is a linear transformation on C^n, then we can
define p(A) to be the linear transformation on C^n given by

(1.107) $p(A) = c_m A^m + c_{m-1} A^{m-1} + \cdots + c_0 I.$

Notice that if p_1, p_2 are polynomials, so that the sum p_1 + p_2 and the
product p_1 p_2 are also polynomials, then we have that

(1.108) $(p_1 + p_2)(A) = p_1(A) + p_2(A)$

and

(1.109) $(p_1 p_2)(A) = p_1(A) p_2(A) = p_2(A) p_1(A).$

Moreover, the composition $p_1 \circ p_2$ is also a polynomial, and
$(p_1 \circ p_2)(A) = p_1(p_2(A))$.

If A is a linear transformation on C^n, v is a vector in C^n which is an
eigenvector for A with eigenvalue $\lambda$, and p(z) is a polynomial, then v is
also an eigenvector for p(A), with eigenvalue $p(\lambda)$. Conversely, if A is a
linear transformation on C^n and h(z) is a polynomial such that
$h(\lambda) \ne 0$ for all eigenvalues $\lambda$ of A, then h(A) is invertible.
As a consequence, for a linear transformation A on C^n and a polynomial p(z),
if a complex number $\mu$ is an eigenvalue of p(A), then there is an eigenvalue
$\lambda$ of A such that $p(\lambda) = \mu$.

The famous Cayley-Hamilton theorem states that for a linear transformation
A on C^n and its characteristic polynomial q_A(z) as above, we have that

(1.110) $q_A(A) = 0.$

It follows that

(1.111) $p(A) = 0$

whenever p(z) is a polynomial which can be expressed as the product of the
characteristic polynomial q_A(z) and another polynomial. This holds when p(z)
vanishes at each eigenvalue of A, and to at least the same order as q_A does.

In particular, we have that A^n can be expressed as a linear combination of
A^k, $1 \le k \le n - 1$, and the identity operator. By repeating this, every
positive integer power of A can be expressed as a linear combination of A^k,
$1 \le k \le n - 1$, and the identity operator. To put it another way, for each
polynomial p(z) there is a polynomial $\widetilde{p}(z)$ of degree at most n - 1
such that $\widetilde{p}(A) = p(A)$.

Also, the exponential of A can be expressed as p(A) for a polynomial p(z).
It is enough to choose p(z) so that it agrees with the exponential function at
the eigenvalues of A, and to sufficiently high order. Notice in particular that
the eigenvalues of exp(A) are therefore all exponentials of eigenvalues of A.
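For n = 2 the polynomial can be written down explicitly. The following is a sketch of the two possible cases, under the assumption that A is a linear transformation on C^2; the symbols $\lambda_1$, $\lambda_2$ for its eigenvalues are introduced here.

```latex
% Case 1: distinct eigenvalues \lambda_1 \ne \lambda_2 (so A is diagonalizable).
% Interpolate \exp at the two eigenvalues:
p(z) = e^{\lambda_1} \frac{z - \lambda_2}{\lambda_1 - \lambda_2}
     + e^{\lambda_2} \frac{z - \lambda_1}{\lambda_2 - \lambda_1},
\qquad \exp(A) = p(A),
% since both sides act as multiplication by e^{\lambda_j} on the
% eigenvectors for \lambda_j, j = 1, 2.
%
% Case 2: a single eigenvalue \lambda, so q_A(z) = (z - \lambda)^2 and
% (A - \lambda I)^2 = 0 by (1.110). Match \exp to first order at \lambda:
p(z) = e^{\lambda} \bigl( 1 + (z - \lambda) \bigr),
\qquad
\exp(A) = \exp(\lambda I) \exp(A - \lambda I)
        = e^{\lambda} \bigl( I + (A - \lambda I) \bigr) = p(A).
```

In the second case the factorization uses (1.88), since $\lambda I$ commutes with $A - \lambda I$.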

We know that the exponential of a linear transformation is automatically
invertible. Conversely, if B is an invertible linear transformation on C^n, is
there a linear transformation A on C^n such that exp(A) = B? The answer is
yes, and indeed one can take A = p(B), where p(z) is a polynomial on C which
satisfies exp(p(z)) = z at the eigenvalues of B, and to sufficiently high order.

Now let us consider the real case. We have seen that the determinant of the
exponential of a linear transformation on R^n is equal to the exponential of the
trace of that linear transformation, and hence is a positive real number. This
is a simple necessary condition for an invertible linear transformation on R^n to
be the exponential of another linear transformation.

Let A be any linear transformation on R^n, and let $\widehat{A}$ denote the
unique complex-linear transformation on C^n which agrees with A on R^n. To be
more precise, $\widehat{A}$ is complex-linear, so that
$\widehat{A}(i v) = i \widehat{A}(v)$, and this ensures that $\widehat{A}$ is
determined by its action on vectors with real coordinates. Also, A and
$\widehat{A}$ are associated to the same n x n matrix with real entries, with
respect to the standard bases of R^n and C^n, respectively.

For any polynomial p(x) with real coefficients, we can define p(A) in the
usual manner, and it has the same basic properties as before. We can also think
of p as a complex polynomial and consider $p(\widehat{A})$, and it is easy to see
that this is the same as the complex-linear transformation on C^n induced by
p(A). In other words,

(1.112) $\widehat{p(A)} = p(\widehat{A}).$

If $\lambda$ is a real number which is an eigenvalue of A, then $\lambda$ is also
an eigenvalue of $\widehat{A}$, using the same eigenvector in fact, and
conversely if $\lambda$ is an eigenvalue of $\widehat{A}$ which is a real number
too, then one can check that $\lambda$ is an eigenvalue for A. However, in
general there can be complex eigenvalues for $\widehat{A}$, and one can check
that if $\lambda$ is an eigenvalue of $\widehat{A}$, then so is the complex
conjugate $\overline{\lambda}$, and with the same multiplicity as a zero of the
characteristic polynomial $q_{\widehat{A}}$. Notice that for x real the
characteristic polynomial $q_{\widehat{A}}(x)$ of $\widehat{A}$ is the same as
the real version for A,

(1.113) $q_{\widehat{A}}(x) = \det(x I - A),$

and in particular the two polynomials have the same coefficients, which are real
numbers.

Suppose that B is an invertible linear transformation on R^n. If, for instance,
B has no eigenvalues which are negative real numbers, then there are polynomials
p(z) with real coefficients such that exp(p(z)) = z to whatever order one
might like at the eigenvalues of B. Consequently, B = exp(A) with A = p(B),
and where A is a linear transformation on R^n.

This is certainly not the whole story however. Let us mention two basic 
examples of linear transformations with positive determinant and negative real 
eigenvalues which can and which cannot be represented as an exponential. We 
can look at this in terms of another correspondence between real and complex- 
linear transformations. 

Namely, we can identify C^n with R^{2n} in the obvious way, with the real and
imaginary parts of the n complex components of a vector in C^n being the 2n real
components of the corresponding vector in R^{2n}. If A is a linear transformation
on C^n, let us write A° for the corresponding real-linear transformation on
R^{2n}. Notice that

(1.114) $(a_1 A_1 + a_2 A_2)^\circ = a_1 A_1^\circ + a_2 A_2^\circ$

when a_1, a_2 are real numbers and A_1, A_2 are complex-linear transformations
on C^n, and that $(A_1 A_2)^\circ = A_1^\circ A_2^\circ$.

On R^2, consider the linear transformation -I. This is a diagonalizable
linear transformation with eigenvalue -1 of multiplicity 2, and the determinant
is equal to 1. This linear transformation is the exponential of another linear
transformation on R^2, because one can think of it as a complex-linear
transformation on C, and convert the realization as an exponential there to one
on R^2.
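Spelled out, the conversion might look as follows. The matrix J is a name introduced here for the real-linear transformation on R^2 that corresponds to multiplication by i on C.

```latex
% On C: multiplication by -1 is the exponential of multiplication by i\pi, by (1.79):
\exp(i \pi) = \cos(\pi) + i \sin(\pi) = -1 .
% Multiplication by i sends x + iy to -y + ix, so on R^2 it is given by
J = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad J^2 = -I ,
% and multiplication by i\pi corresponds to \pi J. Summing the series (1.85):
\exp(\pi J) = \cos(\pi)\, I + \sin(\pi)\, J = -I .
```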

As a different example, suppose that B is a linear transformation on R^2
such that the two standard basis vectors e_1 = (1, 0) and e_2 = (0, 1) are
eigenvectors with eigenvalues $\lambda_1$, $\lambda_2$, respectively, and where
$\lambda_1$, $\lambda_2$ are distinct negative real numbers. If B = exp(A) for
some real linear transformation A on R^2, then A, B commute in particular, and
it follows that A(e_1), A(e_2) lie in the eigenspaces of B with eigenvalues
$\lambda_1$, $\lambda_2$, respectively. Hence A(e_1), A(e_2) should be real
multiples of e_1, e_2, say A(e_1) = a_1 e_1 and A(e_2) = a_2 e_2 with a_1, a_2
real. This leads to a contradiction, because then exp(A) has the positive
eigenvalues exp(a_1), exp(a_2) on e_1, e_2, while $\lambda_1$, $\lambda_2$ are
negative.

If A is any complex-linear transformation on C^n and A° is the corresponding
real-linear transformation on R^{2n}, then

(1.115) $\det A^\circ = |\det A|^2,$

i.e., the determinant of A° as a real-linear transformation is equal to the
absolute value squared of the determinant of A as a complex-linear
transformation. This is not too difficult to show, starting with n = 1, for
instance. In particular, A° always has nonnegative determinant.
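For orientation, here is the n = 1 case written out. The real numbers a, b are introduced here as the real and imaginary parts of the scalar by which A multiplies.

```latex
% n = 1: A is multiplication by a + ib on C, with a, b real, so \det A = a + ib.
% Under x + iy \mapsto (x, y), (a + ib)(x + iy) = (ax - by) + i(bx + ay), hence
A^\circ = \begin{pmatrix} a & -b \\ b & a \end{pmatrix},
\qquad
\det A^\circ = a^2 + b^2 = |a + ib|^2 = |\det A|^2 .
```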

There is another well-known simple trick for expressing a positive power of
a linear transformation as a linear combination of lower powers. Namely, if T
is a linear transformation on R^n or C^n, then there is a positive integer
$k \le n^2$ such that T^k is a linear combination of the identity operator and
T^j, $1 \le j < k$, simply because the vector space of linear transformations on
R^n or C^n has dimension n^2. Of course the version of this from the
Cayley-Hamilton theorem is more precise and explicit.

Suppose that A is a linear transformation on R^n, and let $\widehat{A}$ denote
the corresponding complex-linear transformation on C^n. Of course

(1.116) $\|\widehat{A}\|_{HS} = \|A\|_{HS},$

since A and $\widehat{A}$ correspond to the same n x n matrix of real numbers,
and the norms in question are simply the square root of the sum of squares of
these matrix entries. Moreover, one can check that

(1.117) $\|\widehat{A}\|_{op} = \|A\|_{op},$

where the left side refers to the operator norm of $\widehat{A}$ as a linear
transformation on C^n, and the right side refers to the operator norm of A as a
linear transformation on R^n.

Also,

(1.118) $(\widehat{A})^* = \widehat{(A^*)},$

where the left side is the adjoint of $\widehat{A}$ as a linear transformation on
C^n, and the right side is the complex-linear transformation on C^n induced by
the adjoint of A as a real-linear transformation on R^n. It follows that if A is
an orthogonal linear transformation on R^n, then $\widehat{A}$ is a unitary
linear transformation. One can see this as well using the fact that a real or
complex-linear transformation is orthogonal or unitary, respectively, if and
only if it is invertible, has operator norm equal to 1, and its inverse has
operator norm equal to 1.

If A is an orthogonal or unitary linear transformation on R^n or C^n,
respectively, then

(1.119) $|\det A| = 1.$

Indeed, in this case we have that

(1.120) $1 = \det I = \det(A A^*) = (\det A)(\det A^*) = |\det A|^2.$

Alternatively, one can show this using the fact that

(1.121) $|\det A| \le \|A\|_{op}^n.$

More precisely, if A is a linear transformation on C^n, then det A is the
product of the eigenvalues of A, according to their multiplicities as zeros of the
characteristic polynomial of A, and it is easy to see that

(1.122) $|\lambda| \le \|A\|_{op}$

for each eigenvalue $\lambda$ of A. In the real case one can apply this argument
to the induced complex-linear transformation on C^n, which has the same
determinant. As another argument, it is enough to check that

(1.123) $|\det A| \le 1$ when $\|A\|_{op} \le 1,$

and for this one can observe that the sequence of linear transformations A^k,
$k \ge 1$, is bounded when $\|A\|_{op} \le 1$, and hence that their determinants
are bounded, and hence that the scalars $(\det A)^k$ are bounded, which implies
that $|\det A| \le 1$.
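The reduction from the general bound to the normalized case is a scaling argument, which can be sketched as follows. The abbreviation c for the operator norm of A is introduced here, and A is assumed to be nonzero (otherwise both sides of the bound vanish).

```latex
% Put c = \|A\|_{op} > 0, so that c^{-1} A has operator norm 1.
% Homogeneity of the determinant and the normalized bound give
\Bigl| \det\bigl( c^{-1} A \bigr) \Bigr| = \frac{|\det A|}{c^{\,n}} \le 1
\;\Longrightarrow\;
|\det A| \le \|A\|_{op}^{\,n} .
% Boundedness step: \det(A^k) = (\det A)^k, so if |\det A| > 1 then
% |\det(A^k)| \to \infty, contradicting the boundedness of the A^k.
```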

Suppose that A is a linear transformation on R^n or on C^n which is
anti-self-adjoint, so that

(1.124) $A^* = -A.$

In this case exp(A) is an orthogonal or unitary transformation, as appropriate.
The adjoint of exp(A) is equal to exp(A*), which is the same as exp(-A) in this
case, which is the inverse of exp(A).

Let us consider next the question of when an orthogonal linear transformation
on R^n or a unitary transformation on C^n can be expressed as the exponential
of an anti-self-adjoint linear transformation. To do this we digress a bit for
some general matters about linear transformations. We begin with the complex
case.

A linear transformation T on C^n is said to be normal if T commutes with
its adjoint, which is to say that

(1.125) $T^* T = T T^*.$

We can write any linear transformation T on C^n as T_1 + i T_2, where T_1, T_2
are the self-adjoint linear transformations given by

(1.126) $T_1 = \frac{1}{2}(T + T^*), \quad T_2 = \frac{1}{2i}(T - T^*),$

and the condition of normality is equivalent to saying that T_1, T_2 commute.
Note that unitary transformations are normal.
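The equivalence between normality and the commuting of T_1, T_2 comes from a one-line computation, sketched here using only the decomposition T = T_1 + i T_2 with T_1, T_2 self-adjoint.

```latex
% Since T_1, T_2 are self-adjoint, T^* = T_1 - i T_2, and therefore
T^* T - T T^*
  = (T_1 - i T_2)(T_1 + i T_2) - (T_1 + i T_2)(T_1 - i T_2)
  = 2i \, (T_1 T_2 - T_2 T_1).
% Hence T^* T = T T^* if and only if T_1 T_2 = T_2 T_1.
```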

We already know that if B is a self-adjoint linear transformation, then there 
is an orthonormal basis of the underlying vector space consisting of eigenvectors 
of B. Given two self-adjoint linear transformations which commute, one can find 
an orthonormal basis consisting of vectors which are eigenvectors for both linear 
transformations. Conversely, for a fixed basis, any two linear transformations 
for which vectors in the basis are eigenvectors clearly commute with each other. 

As a result, if T is a normal linear transformation on C^n, then there is an
orthonormal basis of C^n consisting of eigenvectors of T. In particular this
applies to unitary transformations, for which the corresponding eigenvalues are
complex numbers with modulus 1. As a result, if U is a unitary linear
transformation on C^n, then there is an anti-self-adjoint linear transformation
A on C^n such that exp(A) = U, and indeed one can take A to be diagonalized by
the same basis as for U, with imaginary eigenvalues.

Now let us consider the real case. For this we cannot use the trick of writing
an anti-self-adjoint linear transformation as "i" times a self-adjoint linear
transformation. There are other things that we can do, however.

Thus we let A be an anti-self-adjoint linear transformation on R^n. Notice
that

(1.127) $\langle A(v), v \rangle = 0$

for all vectors $v \in R^n$, and that in particular a nonzero vector v in R^n is
an eigenvector for A only if the corresponding eigenvalue is equal to 0, so that
v lies in the kernel of A. Also, for each vector w in R^n, A(w) is orthogonal to
every vector in the kernel of A.

A basic trick to study an anti-self-adjoint linear transformation A is to
consider A^2, which is self-adjoint and has the same kernel as A does. If v is a
vector in R^n which is an eigenvector for A^2 with eigenvalue $\lambda$, then
A(v), if it is nonzero, is an eigenvector for A^2 with eigenvalue $\lambda$ too,
and of course A^2(v) is a multiple of v. As a result, if $\lambda$ is a negative
real number which is an eigenvalue of A^2, then one can check that the
corresponding eigenspace

(1.128) $\{ v \in R^n : A^2(v) = \lambda v \}$

has even dimension.

If T is an orthogonal linear transformation on R^n, then we can write
T = T_1 + T_2, where T_1 = (T + T*)/2 is self-adjoint, T_2 = (T - T*)/2 is
anti-self-adjoint, T_1, T_2 commute, and

(1.129) $\langle T_1(v), T_2(v) \rangle = 0$

for all $v \in R^n$. If $\lambda_1$, $\lambda_2$ are eigenvalues of T_1, T_2^2,
respectively, such that the joint eigenspace

(1.130) $\{ v \in R^n : T_1(v) = \lambda_1 v, \ T_2^2(v) = \lambda_2 v \}$

is nontrivial, then either $\lambda_2 = 0$ and $\lambda_1 = \pm 1$, or
$\lambda_2 < 0$, $\lambda_1^2 - \lambda_2 = 1$, and the joint eigenspace has even
dimension. One can show that the parity of the number of times that
$\lambda_1 = -1$ is even or odd according to whether det T is 1 or -1, and that
when det T = 1, there is an anti-self-adjoint linear transformation A on R^n
such that T = exp(A).
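The relation between the two eigenvalues can be traced back to orthogonality of T. The following is a sketch of that step, using only T* = T_1 - T_2 and the fact that T_1, T_2 commute.

```latex
% Orthogonality T^* T = I, written in terms of T_1 and T_2:
I = T^* T = (T_1 - T_2)(T_1 + T_2) = T_1^2 - T_2^2 .
% On a joint eigenvector v with T_1 v = \lambda_1 v and T_2^2 v = \lambda_2 v:
v = (T_1^2 - T_2^2)(v) = (\lambda_1^2 - \lambda_2)\, v
\;\Longrightarrow\;
\lambda_1^2 - \lambda_2 = 1 .
% Moreover T_2^2 = -T_2^* T_2, so
\lambda_2 \, \langle v, v \rangle = \langle T_2^2 v, v \rangle
  = -\langle T_2 v, T_2 v \rangle \le 0 ,
% and \lambda_2 = 0 forces \lambda_1 = \pm 1.
```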

If A is a self-adjoint linear transformation on R^n or on C^n, then exp(A) is
also self-adjoint, and in fact exp(A) is positive-definite, because exp(A) = B^2
where B is the self-adjoint linear transformation exp(A/2). Conversely, every
self-adjoint positive-definite linear transformation P on R^n or on C^n can be
realized as exp(A) for a self-adjoint linear transformation A. This can be seen
using an orthonormal basis of eigenvectors for P, and indeed one can choose A
so that the vectors in the same basis are eigenvectors for A.

Actually, if A is a self-adjoint linear transformation on R^n or on C^n, then
there is an orthonormal basis of eigenvectors for A, and of course these same
vectors are eigenvectors for exp(A), with the eigenvalues for exp(A) being the
exponentials of the corresponding eigenvalues for A. As a result, for a given
self-adjoint positive-definite linear transformation P on R^n or C^n, the
self-adjoint linear transformation A on R^n or C^n, respectively, such that
exp(A) = P is unique. This is analogous to the situation for the ordinary
exponential function on real numbers, while in the complex case one can have
different numbers or linear transformations whose exponentials are equal to each
other.

There is a natural mapping from the group of invertible linear transformations
on R^n or on C^n onto the self-adjoint positive-definite linear transformations
on the same space, given by

(1.131) $T \mapsto T T^*.$

If T is an invertible linear transformation on R^n or on C^n and R is an
orthogonal or unitary linear transformation on the same space, as appropriate,
then T and T R are sent by the mapping just defined to the same positive-definite
linear transformation, since

(1.132) $(T R)(T R)^* = T R R^* T^* = T T^*.$

Conversely, if T, T' are invertible linear transformations on R^n or on C^n such
that T' (T')* = T T*, then there is an orthogonal or unitary linear
transformation R, as appropriate, such that

(1.133) $T' = T R.$

Also, every self-adjoint positive-definite linear transformation P on R^n or on
C^n arises in this manner, and in fact can be written in a unique manner as Q^2
for a self-adjoint positive-definite linear transformation Q. If A is an
invertible linear mapping on R^n or on C^n, then we get a nice action of A on
the self-adjoint positive-definite linear transformations by the formula

(1.134) $P \mapsto A P A^*.$

If A_1 and A_2 are two invertible linear transformations on R^n or on C^n such
that A_1 P A_1* = A_2 P A_2* for all self-adjoint positive-definite linear
transformations P, then A_1 = c A_2 for a scalar c of modulus 1 (so that
$A_1 = \pm A_2$ in the real case), and if P_1, P_2 are two self-adjoint
positive-definite linear transformations on R^n or on C^n, then there is an
invertible linear transformation A on the same space such that
P_2 = A P_1 A*.

2 Spaces of matrices 

As before, we write $\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$ for the spaces of real
and complex-linear mappings on R^n and C^n, respectively. We also write
GL(R^n), GL(C^n) for the general linear groups of invertible real and
complex-linear transformations on R^n, C^n, respectively. We can identify
$\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$ with $R^{n^2}$, $C^{n^2}$ using the
standard correspondence between linear transformations and matrices, and in
this way we have that GL(R^n), GL(C^n) are open subsets of $\mathcal{L}(R^n)$,
$\mathcal{L}(C^n)$, respectively.

The determinant can be viewed as a homogeneous polynomial of degree n
on $\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$, and GL(R^n), GL(C^n) can be described
as the subsets of $\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$ defined by the condition

(2.1) $\det T \ne 0.$

At the identity operator, the differential of the determinant can be identified
with the trace, since

(2.2) $\left( \frac{d}{dr} \det(I + r A) \right)_{r=0} = \operatorname{tr} A$

for any linear transformation A on R^n or C^n. If T is an invertible linear
transformation on R^n or on C^n and A is another linear transformation on the
same space, then the differential of the determinant at T in the direction A can
be expressed as

(2.3) $d(\det)_T(A) = (\det T) \operatorname{tr}(T^{-1} A),$

since

(2.4) $\left( \frac{d}{dr} \det(T + r A) \right)_{r=0} = (\det T) \left( \frac{d}{dr} \det(I + r T^{-1} A) \right)_{r=0}$

(2.5) $\qquad = (\det T) \operatorname{tr}(T^{-1} A).$

We write SL(R^n), SL(C^n) for the subgroups of GL(R^n), GL(C^n) consisting
of linear transformations with determinant equal to 1. These are nice
submanifolds of GL(R^n), GL(C^n), because the differential of the determinant is
not equal to 0 at any point in SL(R^n), SL(C^n), or in GL(R^n), GL(C^n), for that
matter. Also we have the maps

(2.6) $T \mapsto (\det T)^{-1/n} \, T$

from invertible linear transformations on R^n, C^n to linear transformations with
determinant 1, at least if we restrict our attention to T's with det T > 0 in the
real case and T's with det T in a nice region in C that contains 1 and on which
$z^{-1/n}$ is defined in the complex case.
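That this map lands in determinant 1 is a consequence of the homogeneity of the determinant. A short check, assuming a branch of the n-th root has been fixed as described above:

```latex
% Homogeneity of degree n: \det(c\,T) = c^n \det T for a scalar c.
% Apply this with c = (\det T)^{-1/n}:
\det\bigl( (\det T)^{-1/n} \, T \bigr)
  = \bigl( (\det T)^{-1/n} \bigr)^{n} \det T
  = (\det T)^{-1} \det T
  = 1 .
```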

If T is an element of GL(R^n) or GL(C^n), then the space of tangent vectors
to GL(R^n) or GL(C^n) at T, as appropriate, can be identified with
$\mathcal{L}(R^n)$ or $\mathcal{L}(C^n)$, as appropriate, since GL(R^n), GL(C^n)
are open subsets of $\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$, respectively. If T is
an element of SL(R^n) or SL(C^n), then the space of tangent vectors to SL(R^n)
or SL(C^n) at T is equal to the space of A in $\mathcal{L}(R^n)$ or
$\mathcal{L}(C^n)$ such that

(2.7) $\operatorname{tr}(T^{-1} A) = 0,$

as appropriate. In other words, these are the tangent vectors to GL(R^n) or
GL(C^n) at T, respectively, which lie in the kernel of the differential of the
determinant function at T.

An important mapping on GL(R^n) or GL(C^n) is the one defined by

(2.8) $F(T) = T^{-1},$

and which also sends SL(R^n) or SL(C^n) to itself, as appropriate. For each
invertible linear transformation T the differential of this mapping at T can be
expressed as

(2.9) $dF_T(A) = -T^{-1} A T^{-1},$

which is to say that

(2.10) $\left( \frac{d}{dr} (T + r A)^{-1} \right)_{r=0} = -T^{-1} A T^{-1}.$

Indeed,

(2.11) $(T + r A)^{-1} = (I + r T^{-1} A)^{-1} T^{-1}$
$\qquad = (I - r T^{-1} A + O(r^2)) \, T^{-1} = T^{-1} - r \, T^{-1} A T^{-1} + O(r^2).$

If T is an invertible linear transformation on R^n and A, B are linear
transformations on R^n, then set

(2.12) $\langle A, B \rangle_T = \operatorname{tr}(T^{-1} A T^{-1} B).$

This is a symmetric bilinear form in A, B for each T, which is to say that
$\langle A, B \rangle_T$ is a linear function of A for each B and T, a linear
function of B for each A and T, and that in fact

(2.13) $\langle A, B \rangle_T = \langle B, A \rangle_T$

for all A, B, T, so that linearity in A and B are equivalent to each other.
Moreover, this bilinear form is nondegenerate, which means that for each T and
for each $A \ne 0$ there is a B such that $\langle A, B \rangle_T \ne 0$.

In the complex case, if T is an invertible linear transformation on C^n and
A, B are linear transformations on C^n, then

(2.14) $\operatorname{tr}(T^{-1} A T^{-1} B)$

is a complex-valued symmetric bilinear form in A, B for each T which is
nondegenerate. To get a real-valued quantity, we set

(2.15) $\langle A, B \rangle_T = \operatorname{Re} \operatorname{tr}(T^{-1} A T^{-1} B),$

i.e., we take the real part of the trace. This is still real-bilinear, which means
that it is real-linear in each of A and B, and symmetric and nondegenerate.

Notice that $\langle A, B \rangle_T$ depends smoothly on T for T in GL(R^n) or
GL(C^n), as appropriate. As a result, $\langle A, B \rangle_T$ is said to define a
semi-Riemannian structure, also known as a pseudo-Riemannian structure or a
Riemannian structure with signature, on GL(R^n) or GL(C^n), as appropriate. On
GL(C^n), if we did not take the real part, then we would have a holomorphic
semi-Riemannian structure.

At points T in SL(R^n) or SL(C^n), we can restrict our attention to A, B
which are in the tangent space of SL(R^n) or SL(C^n) at T, as appropriate.
Explicitly, this means that we restrict our attention to A, B such that

(2.16) $\operatorname{tr}(T^{-1} A) = \operatorname{tr}(T^{-1} B) = 0.$

This leads to semi-Riemannian structures on SL(R^n), SL(C^n), and a holomorphic
semi-Riemannian structure on SL(C^n) if we do not take the real part.

For each linear transformation Z on R^n or on C^n, define linear
transformations $\lambda_Z$, $\rho_Z$ on $\mathcal{L}(R^n)$ or on
$\mathcal{L}(C^n)$, respectively, by

(2.17) $\lambda_Z(T) = Z T, \quad \rho_Z(T) = T Z,$

which is to say that $\lambda_Z$, $\rho_Z$ correspond to left and right
multiplication by Z. These are linear transformations, and in particular their
differentials are given by themselves,

(2.18) $(d\lambda_Z)_T(A) = \lambda_Z(A), \quad (d\rho_Z)_T(A) = \rho_Z(A)$

for all linear transformations T, A on R^n or on C^n, as appropriate. If Z is an
invertible linear transformation on R^n or on C^n, then $\lambda_Z$, $\rho_Z$ map
GL(R^n) or GL(C^n) onto itself, as appropriate, and if

(2.19) $\det Z = 1,$

then $\lambda_Z$, $\rho_Z$ map SL(R^n) or SL(C^n) onto itself, as appropriate.

If Z is an invertible linear transformation on R^n or on C^n, then the
mappings $\lambda_Z$, $\rho_Z$ on GL(R^n) or on GL(C^n) preserve the
semi-Riemannian structure $\langle \cdot, \cdot \rangle_T$. In other words, if T
is an element of GL(R^n) or GL(C^n) and A, B are elements of $\mathcal{L}(R^n)$
or $\mathcal{L}(C^n)$, as appropriate, which we view as tangent vectors to
GL(R^n) or GL(C^n) at T, then

(2.20) $\langle (d\lambda_Z)_T(A), (d\lambda_Z)_T(B) \rangle_{\lambda_Z(T)} = \langle A, B \rangle_T$

and

(2.21) $\langle (d\rho_Z)_T(A), (d\rho_Z)_T(B) \rangle_{\rho_Z(T)} = \langle A, B \rangle_T.$

This is easy to verify. By restriction, if det Z = 1, so that $\lambda_Z$,
$\rho_Z$ can be viewed as defining mappings on SL(R^n) or SL(C^n), as
appropriate, then $\lambda_Z$, $\rho_Z$ preserve the restriction of our
semi-Riemannian structures to SL(R^n) or SL(C^n).
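For completeness, the verification of the two invariance identities can be written out as follows, in the real case; in the complex case one takes real parts throughout. Only the definition of the bilinear form and the cyclic property of the trace are used.

```latex
% Left multiplication: base point ZT, tangent vectors ZA and ZB.
\operatorname{tr}\bigl( (ZT)^{-1} (ZA) (ZT)^{-1} (ZB) \bigr)
  = \operatorname{tr}\bigl( T^{-1} Z^{-1} Z A \, T^{-1} Z^{-1} Z B \bigr)
  = \operatorname{tr}\bigl( T^{-1} A T^{-1} B \bigr).
% Right multiplication: base point TZ, tangent vectors AZ and BZ.
\operatorname{tr}\bigl( (TZ)^{-1} (AZ) (TZ)^{-1} (BZ) \bigr)
  = \operatorname{tr}\bigl( Z^{-1} T^{-1} A T^{-1} B Z \bigr)
  = \operatorname{tr}\bigl( T^{-1} A T^{-1} B \bigr),
% where the last step uses \operatorname{tr}(Z^{-1} X Z) = \operatorname{tr}(X).
```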

One can also check that $F(T) = T^{-1}$ preserves the semi-Riemannian
structures on GL(R^n), GL(C^n). That is, if T is an element of GL(R^n) or
GL(C^n), and if A, B are elements of $\mathcal{L}(R^n)$ or $\mathcal{L}(C^n)$,
respectively, which we can view as tangent vectors to GL(R^n) or GL(C^n) at T,
then

(2.22) $\langle (dF)_T(A), (dF)_T(B) \rangle_{F(T)} = \langle A, B \rangle_T.$

Also, SL(R^n) and SL(C^n) are invariant under $F(T) = T^{-1}$, and the
restriction of the semi-Riemannian structure to SL(R^n), SL(C^n) is preserved by
F, since this holds on GL(R^n), GL(C^n).
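The check for the inversion map is similar and can be sketched in a few lines, using the formula $dF_T(A) = -T^{-1} A T^{-1}$ for the differential and the fact that the base point F(T) has inverse T (real case; take real parts in the complex case).

```latex
% Base point F(T) = T^{-1}, whose inverse is T; the two minus signs cancel.
\operatorname{tr}\bigl( T \, (T^{-1} A T^{-1}) \, T \, (T^{-1} B T^{-1}) \bigr)
  = \operatorname{tr}\bigl( A \, T^{-1} B \, T^{-1} \bigr)
  = \operatorname{tr}\bigl( T^{-1} A T^{-1} B \bigr),
% where the last equality is the cyclic property of the trace.
```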

In the complex case, let us note that $\lambda_Z$, $\rho_Z$ are complex-linear
transformations, and $F(T) = T^{-1}$ is a holomorphic transformation, and that
they preserve the holomorphic version of the semi-Riemannian structure on
GL(C^n) and SL(C^n).

Of course we can define a flat semi-Riemannian metric on $\mathcal{L}(R^n)$ by
saying that if T, A, B are linear transformations on R^n, where we think of A, B
as tangent vectors to $\mathcal{L}(R^n)$ at T, then the inner product of these
two tangent vectors associated to T is given by

(2.23) $\operatorname{tr}(A B).$

In the complex case the same formula defines a holomorphic semi-Riemannian
structure on $\mathcal{L}(C^n)$, and to get an ordinary semi-Riemannian structure
one should take the real part. That (2.23) does not depend on T reflects the fact
that these semi-Riemannian structures are flat. Of course we can restrict these
semi-Riemannian structures to the subspaces $\mathcal{L}_0(R^n)$,
$\mathcal{L}_0(C^n)$ of linear transformations with trace equal to 0.

The exponential mapping defines a mapping from $\mathcal{L}(R^n)$,
$\mathcal{L}(C^n)$ to GL(R^n), GL(C^n) respectively, sending 0 to I and with

(2.24) $d\exp_0(A) = A.$

In particular, the standard flat metric at 0 on $\mathcal{L}(R^n)$,
$\mathcal{L}(C^n)$ corresponds exactly to the semi-Riemannian structure on
GL(R^n), GL(C^n) at I, respectively, under the differential of the exponential
mapping at 0. In fact,

(2.25) $\langle d\exp_T(A), d\exp_T(B) \rangle_{\exp T}$
$\qquad = \operatorname{tr}\bigl( \exp(-T) \, (d\exp_T(A)) \, \exp(-T) \, (d\exp_T(B)) \bigr)$

agrees with

(2.26) $\operatorname{tr}(A B)$

to another term in the Taylor expansion, which is to say up to terms of order
$O(\|T\|_{op}^2)$. In other words, using $\exp(-T) = I - T + O(\|T\|_{op}^2)$,

(2.27) $d\exp_T(A) = A + \frac{1}{2}(T A + A T) + O(\|T\|_{op}^2),$

and similarly for B, one can check that the terms in (2.25) with no T's reduce
to tr(AB), and that the terms with exactly one T cancel out. In the complex
case, let us note that the exponential mapping is holomorphic, and one has the
analogous statement about the holomorphic semi-Riemannian metrics on
$\mathcal{L}(C^n)$ and GL(C^n) agreeing at 0 up to terms of order
$O(\|T\|_{op}^2)$.
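The cancellation of the first-order terms can be made explicit. In the following sketch the abbreviations E, A_T, B_T are introduced here only to keep the formulas short.

```latex
% Abbreviations: E = \exp(-T) = I - T + O(\|T\|_{op}^2),
% A_T = d\exp_T(A) = A + \tfrac12 (TA + AT) + O(\|T\|_{op}^2), and similarly B_T.
\operatorname{tr}(E \, A_T \, E \, B_T)
  = \operatorname{tr}(AB)
    - \operatorname{tr}(TAB) - \operatorname{tr}(ATB)
    + \tfrac12 \operatorname{tr}\bigl( (TA + AT) B \bigr)
    + \tfrac12 \operatorname{tr}\bigl( A (TB + BT) \bigr)
    + O(\|T\|_{op}^2).
% By cyclicity, \operatorname{tr}(ABT) = \operatorname{tr}(TAB), so
\tfrac12 \bigl[ \operatorname{tr}(TAB) + \operatorname{tr}(ATB) \bigr]
+ \tfrac12 \bigl[ \operatorname{tr}(ATB) + \operatorname{tr}(ABT) \bigr]
  = \operatorname{tr}(TAB) + \operatorname{tr}(ATB),
% which cancels the two terms coming from the factors E. Hence
\operatorname{tr}(E \, A_T \, E \, B_T) = \operatorname{tr}(AB) + O(\|T\|_{op}^2).
```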

As a consequence, for each linear transformation A on R^n or on C^n, exp(tA)
satisfies the equation for geodesics at t = 0. This is because tA is simply a
straight line in $\mathcal{L}(R^n)$ or $\mathcal{L}(C^n)$, as appropriate, and
thus satisfies the equation for geodesics there with respect to the flat
semi-Riemannian structure being used, and because the exponential mapping takes
the flat semi-Riemannian structures on $\mathcal{L}(R^n)$, $\mathcal{L}(C^n)$
around 0 to the semi-Riemannian structures on GL(R^n), GL(C^n) around I to
sufficient precision, as in the preceding paragraph. In fact, it follows that
exp(tA) satisfies the equation for geodesics for all t, because one can use the
invariance of the semi-Riemannian metrics on $\mathcal{L}(R^n)$,
$\mathcal{L}(C^n)$ under ordinary translations and the invariance of the
semi-Riemannian structures on GL(R^n), GL(C^n) under left and right
multiplication by invertible linear transformations to reduce the case of general
t to t = 0. In the complex case, one can take t to be a complex parameter, and
say that exp(tA) is a holomorphic geodesic in GL(C^n) with respect to the
holomorphic semi-Riemannian structure as before.

If $A$, $Y$ are linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$ with $Y$ invertible, then
$Y \exp(t A)$ defines a geodesic in $GL(\mathbf{R}^n)$ or $GL(\mathbf{C}^n)$, as appropriate, again using
the invariance of the semi-Riemannian structures that we have defined. This is
equivalent to saying that if $B$, $Y$ are linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$
with $Y$ invertible, then $\exp(t B)\, Y$ defines a geodesic in $GL(\mathbf{R}^n)$ or $GL(\mathbf{C}^n)$, as appropriate, since
$Y \exp(t A) = \exp(t B)\, Y$ with $B = Y A Y^{-1}$, and anyway our semi-Riemannian
structures on the general linear groups are invariant under both left and right
multiplications. This accounts for all of the geodesics, because the equation
for geodesics is a second-order differential equation, and thus
a geodesic is characterized by a point that it passes through and the tangent
vector corresponding to its derivative at that point.

The preceding discussion can also be applied to the restriction of the
exponential mapping to the subspaces $\mathcal{L}_0(\mathbf{R}^n)$, $\mathcal{L}_0(\mathbf{C}^n)$ of $\mathcal{L}(\mathbf{R}^n)$, $\mathcal{L}(\mathbf{C}^n)$ taking
values in the subgroups $SL(\mathbf{R}^n)$, $SL(\mathbf{C}^n)$ of $GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$, using the
restrictions of the corresponding semi-Riemannian metrics. In particular, $SL(\mathbf{R}^n)$,
$SL(\mathbf{C}^n)$ are totally geodesic submanifolds of $GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$. That is, a
geodesic in $GL(\mathbf{R}^n)$ or $GL(\mathbf{C}^n)$ which passes through $SL(\mathbf{R}^n)$ or $SL(\mathbf{C}^n)$ and
is tangent to the special linear group at the point of intersection stays in the
special linear group. Note that $SL(\mathbf{C}^n)$ is a complex submanifold of $GL(\mathbf{C}^n)$.

Fix a positive integer $n$, and let $\mathcal{F}$ be a flag in $\mathbf{R}^n$ or $\mathbf{C}^n$, which is to say a
family $L_1, L_2, \ldots, L_k$ of distinct nontrivial proper linear subspaces of $\mathbf{R}^n$ or $\mathbf{C}^n$
with

(2.28) $L_1 \subset L_2 \subset \cdots \subset L_k$.

Thus $k$ is a positive integer strictly less than $n$, called the length of the flag. It
may be that $k = 1$, so that the flag consists of a single nontrivial proper linear
subspace.

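If $\mathcal{F}$ is a flag in $\mathbf{R}^n$ or in $\mathbf{C}^n$, then we write $\mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$ or $\mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$ for the space
of linear transformations $A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, such that $A(L_j) \subseteq L_j$
for each of the linear subspaces $L_j$ in the flag. Thus $\mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$, $\mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$ are
themselves linear subspaces of $\mathcal{L}(\mathbf{R}^n)$, $\mathcal{L}(\mathbf{C}^n)$, respectively, which are also closed
under taking products of linear transformations. By using bases for $\mathbf{R}^n$ or $\mathbf{C}^n$, as
appropriate, which are suitably adapted to the flag $\mathcal{F}$, one can also characterize
the linear transformations in $\mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$ or $\mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$, as appropriate, in terms of
matrices with certain entries equal to 0.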

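Similarly, if $\mathcal{F}$ is a flag in $\mathbf{R}^n$ or in $\mathbf{C}^n$, then we write $GL_{\mathcal{F}}(\mathbf{R}^n)$ or $GL_{\mathcal{F}}(\mathbf{C}^n)$
for the space of invertible linear transformations $T$ on $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate,
such that $T(L_j) = L_j$ for each linear subspace $L_j$ in the flag. This is equivalent
to saying that

(2.29) $GL_{\mathcal{F}}(\mathbf{R}^n) = GL(\mathbf{R}^n) \cap \mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$

and

(2.30) $GL_{\mathcal{F}}(\mathbf{C}^n) = GL(\mathbf{C}^n) \cap \mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$.

Furthermore, let us put

(2.31) $SL_{\mathcal{F}}(\mathbf{R}^n) = SL(\mathbf{R}^n) \cap \mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$

and

(2.32) $SL_{\mathcal{F}}(\mathbf{C}^n) = SL(\mathbf{C}^n) \cap \mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$.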

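The exponential mapping can be restricted to $\mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$ or $\mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$ to get a
mapping into $GL_{\mathcal{F}}(\mathbf{R}^n)$ or $GL_{\mathcal{F}}(\mathbf{C}^n)$, as appropriate. One can restrict a bit
further to linear transformations in $\mathcal{L}_{\mathcal{F}}(\mathbf{R}^n)$ or $\mathcal{L}_{\mathcal{F}}(\mathbf{C}^n)$ with trace equal to 0,
which the exponential mapping sends to linear transformations in $SL_{\mathcal{F}}(\mathbf{R}^n)$
or $SL_{\mathcal{F}}(\mathbf{C}^n)$. As before, one can account for all of the geodesics in $GL_{\mathcal{F}}(\mathbf{R}^n)$,
$GL_{\mathcal{F}}(\mathbf{C}^n)$ or in $SL_{\mathcal{F}}(\mathbf{R}^n)$, $SL_{\mathcal{F}}(\mathbf{C}^n)$ using exponentials, and these define totally
geodesic submanifolds of $GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$, respectively.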

Let us write $\mathcal{S}(\mathbf{R}^n)$, $\mathcal{S}(\mathbf{C}^n)$ for the real vector spaces of self-adjoint linear
transformations on $\mathbf{R}^n$, $\mathbf{C}^n$, respectively, and $\mathcal{S}_+(\mathbf{R}^n)$, $\mathcal{S}_+(\mathbf{C}^n)$ for their open
cones of positive-definite linear transformations. These cones are invariant
under the transformation $F(T) = T^{-1}$, and also under the action

(2.33) $T \mapsto Z^* T Z$

for each $Z$ in $GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$, as appropriate. In fact, this action is transitive,
which is to say that for each $T_1$, $T_2$ in $\mathcal{S}_+(\mathbf{R}^n)$ or in $\mathcal{S}_+(\mathbf{C}^n)$ there is a $Z$ in
$GL(\mathbf{R}^n)$ or $GL(\mathbf{C}^n)$, as appropriate, such that $Z^* T_1 Z = T_2$.

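The restrictions of our semi-Riemannian structures on $GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$ to
$\mathcal{S}_+(\mathbf{R}^n)$, $\mathcal{S}_+(\mathbf{C}^n)$, respectively, are Riemannian metrics, which is to say that
they are positive definite. Of course these Riemannian metrics are invariant
under the transformations preserving $\mathcal{S}_+(\mathbf{R}^n)$, $\mathcal{S}_+(\mathbf{C}^n)$ mentioned in the
previous paragraph. The exponential mapping sends $\mathcal{S}(\mathbf{R}^n)$, $\mathcal{S}(\mathbf{C}^n)$ onto $\mathcal{S}_+(\mathbf{R}^n)$,
$\mathcal{S}_+(\mathbf{C}^n)$, respectively, and the geodesics in $\mathcal{S}_+(\mathbf{R}^n)$, $\mathcal{S}_+(\mathbf{C}^n)$ through $I$ are
exactly the curves $\exp(t A)$, with $A$ in $\mathcal{S}(\mathbf{R}^n)$, $\mathcal{S}(\mathbf{C}^n)$, respectively. The geodesics
through a point $T = Z^* Z$ are of the form $Z^* \exp(t A)\, Z$.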

The orthogonal and unitary groups on $\mathbf{R}^n$, $\mathbf{C}^n$ are denoted $O(\mathbf{R}^n)$ and
$U(\mathbf{C}^n)$ and consist of the invertible linear transformations $T$ which are
orthogonal or unitary, respectively, which is to say that

(2.34) $T^* T = T T^* = I$.

It is enough to have $T^* T = I$, and at a point $T$ in $O(\mathbf{R}^n)$ or $U(\mathbf{C}^n)$ the tangent
space to the orthogonal or unitary group consists of the linear transformations
$A$ on $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, such that

(2.35) $T^* A + A^* T = 0$.

Let us note that the orthogonal and unitary groups are compact smooth
submanifolds of the vector spaces of all linear transformations on $\mathbf{R}^n$, $\mathbf{C}^n$, and of
$GL(\mathbf{R}^n)$, $GL(\mathbf{C}^n)$ in particular.

Again we can restrict our semi-Riemannian structures from $GL(\mathbf{R}^n)$ or
$GL(\mathbf{C}^n)$ to the submanifolds given by the orthogonal and unitary groups,
respectively. Now these restricted structures are negative-definite, so that their
negatives are Riemannian metrics. Using the group structure we again have
the mappings $\lambda_Z(T) = Z T$ and $\rho_Z(T) = T Z$, which send the orthogonal and
unitary groups to themselves as long as $Z$ also lies in the orthogonal or unitary
group, and the mapping $F(T) = T^{-1}$ takes the orthogonal and unitary
groups to themselves as well. The resulting Riemannian metrics on the
orthogonal and unitary groups are preserved by these transformations. If $A$ is an
anti-self-adjoint linear transformation on $\mathbf{R}^n$ or on $\mathbf{C}^n$, then $\exp(t A)$ defines a
geodesic in $O(\mathbf{R}^n)$ or $U(\mathbf{C}^n)$, as appropriate, and this accounts for all geodesics
in the orthogonal and unitary groups through $I$, and hence for all geodesics if
one also takes into account the left or right translation mappings $\lambda_Z$, $\rho_Z$, as
before.

In these various cases one can restrict further to linear transformations with
determinant equal to 1. Let us write $\mathcal{M}(\mathbf{R}^n)$, $\mathcal{M}(\mathbf{C}^n)$ for the hypersurfaces in
$\mathcal{S}_+(\mathbf{R}^n)$, $\mathcal{S}_+(\mathbf{C}^n)$ consisting of linear transformations with determinant equal to
1, and $SO(\mathbf{R}^n)$ and $SU(\mathbf{C}^n)$ for the special orthogonal and unitary groups on
$\mathbf{R}^n$ and $\mathbf{C}^n$, which are the subgroups of $O(\mathbf{R}^n)$, $U(\mathbf{C}^n)$ determined by the
condition that the determinant of the corresponding linear transformation be equal
to 1. There are similar considerations as before concerning tangent vectors,
Riemannian structures, geodesics, and so on.

3 Some geometric situations 

As usual, $\mathbf{Z}$ denotes the integers, and $\mathbf{Z}^n$ consists of $n$-tuples of integers.
Sometimes we might refer to $\mathbf{Z}^n$ as the standard integer lattice in $\mathbf{R}^n$. If we say that $L$
is a lattice in $\mathbf{R}^n$, then we mean that there is an invertible linear transformation
$A$ on $\mathbf{R}^n$ such that

(3.1) $L = A(\mathbf{Z}^n)$.

If $L$ is a lattice in $\mathbf{R}^n$, then we can form the quotient space $\mathbf{R}^n/L$. That is,
two vectors $x$, $y$ in $\mathbf{R}^n$ are identified in the quotient if their difference $x - y$ lies
in $L$. In particular, we get a canonical quotient mapping

(3.2) $q : \mathbf{R}^n \to \mathbf{R}^n/L$

which sends a vector $x$ in $\mathbf{R}^n$ to the corresponding element of the quotient.

Now, with respect to ordinary vector addition, $\mathbf{R}^n$ is an abelian group, and a
lattice $L$ is a subgroup of $\mathbf{R}^n$. We can think of the quotient $\mathbf{R}^n/L$ as a quotient
in the sense of group theory. The quotient is an abelian group under addition,
and the canonical quotient mapping is a group homomorphism.

We can also look at the quotient $\mathbf{R}^n/L$ in terms of topology. Namely, it
inherits a topology from the one on $\mathbf{R}^n$ so that the canonical quotient mapping
is an open continuous mapping, which means that both images and inverse
images of open sets are open sets. Indeed, the canonical quotient mapping is
a nice covering mapping, so that for every point $x$ in $\mathbf{R}^n$ there is a neighborhood
$U$ of $x$ in $\mathbf{R}^n$ such that the restriction of $q$ to $U$ is a homeomorphism from $U$
onto the open set $q(U)$ in $\mathbf{R}^n/L$. For that matter we can think of $\mathbf{R}^n/L$ as a
smooth manifold, with the quotient mapping $q$ as a smooth mapping which is
a local diffeomorphism.

Suppose that $L_1$, $L_2$ are lattices in $\mathbf{R}^n$, and let

(3.3) $q_1 : \mathbf{R}^n \to \mathbf{R}^n/L_1$, $q_2 : \mathbf{R}^n \to \mathbf{R}^n/L_2$

be the corresponding canonical quotient mappings. If $A$ is an invertible linear
transformation on $\mathbf{R}^n$ such that

(3.4) $A(L_1) = L_2$,

then we get an induced mapping

(3.5) $\widehat{A} : \mathbf{R}^n/L_1 \to \mathbf{R}^n/L_2$.

This mapping is a group isomorphism and a homeomorphism, and even a
diffeomorphism, which satisfies the obvious compatibility condition with the
corresponding canonical quotient mappings $q_1$, $q_2$, namely $q_2 \circ A = \widehat{A} \circ q_1$.

When $n = 1$, one can consider the lattice $2\pi\mathbf{Z}$ consisting of integer multiples
of $2\pi$, and it is customary to identify $\mathbf{R}/2\pi\mathbf{Z}$ with the unit circle $\mathbf{T}$ in the
complex numbers $\mathbf{C}$,

(3.6) $\mathbf{T} = \{ z \in \mathbf{C} : |z| = 1 \}$,

where $|z|$ denotes the usual modulus of $z \in \mathbf{C}$, $|z| = (x^2 + y^2)^{1/2}$ when $z = x + i y$,
$x, y \in \mathbf{R}$. More precisely, $t \mapsto \exp(i t)$ is an explicit version of the canonical quotient
mapping from $\mathbf{R}$ onto $\mathbf{T}$ with respect to this identification, which is a
local diffeomorphism and a group homomorphism using the group structure of
multiplication on $\mathbf{T}$. In general, we can identify $\mathbf{R}^n/2\pi\mathbf{Z}^n$ with $\mathbf{T}^n$, the
$n$-fold Cartesian product of $\mathbf{T}$, where $2\pi\mathbf{Z}^n$ denotes the lattice of points whose
coordinates are all integer multiples of $2\pi$.

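Suppose that $L$ is a lattice in $\mathbf{R}^n$. Also let $A$ be an invertible linear
mapping on $\mathbf{R}^n$ such that $A(2\pi\mathbf{Z}^n) = L$. Thus $\widehat{A}$ is a group isomorphism and a
diffeomorphism from $\mathbf{R}^n/2\pi\mathbf{Z}^n \cong \mathbf{T}^n$ onto $\mathbf{R}^n/L$.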

There is a more precise way to look at the quotient of $\mathbf{R}^n$ by a lattice, which
is to say that the quotient space has a kind of local affine structure. That
is, there is a local affine structure in which the canonical quotient mapping is
considered to be locally affine, and which permits one to say when a curve in the
quotient is locally a straight line segment, like an arc on a line, and when it has
locally constant speed, etc. If $L_1$, $L_2$ are lattices in $\mathbf{R}^n$ and $A$ is an invertible
linear mapping on $\mathbf{R}^n$ such that $A(L_1) = L_2$, then the induced mapping $\widehat{A}$ from
$\mathbf{R}^n/L_1$ onto $\mathbf{R}^n/L_2$ preserves this local affine structure on the quotient spaces.

There is an even more precise way to look at the quotient $\mathbf{R}^n/L$ of $\mathbf{R}^n$ by a
lattice $L$, which is that it has a local flat geometric structure, induced from the
one on $\mathbf{R}^n$. With respect to this structure one can make local measurements of
lengths, volumes, and angles, like the length of a curve, the angle at which two
curves meet at a point, or the volume of a nice subset. In technical terms this
can be seen as a Riemannian metric.

In particular, one can define the volume of such a quotient $\mathbf{R}^n/L$, where the
volume of $\mathbf{R}^n/\mathbf{Z}^n$ is equal to 1, and the volume of $\mathbf{R}^n/2\pi\mathbf{Z}^n$ is equal to $(2\pi)^n$. In
general, if $L_1$, $L_2$ are lattices in $\mathbf{R}^n$ and $A$ is an invertible linear transformation
on $\mathbf{R}^n$ such that $A(L_1) = L_2$, then the volume of $\mathbf{R}^n/L_2$ is equal to $|\det A|$
times the volume of $\mathbf{R}^n/L_1$, and more generally if $E$ is a nice subset of $\mathbf{R}^n/L_1$,
then the volume of $\widehat{A}(E)$ in $\mathbf{R}^n/L_2$ is equal to $|\det A|$ times the volume of $E$
in $\mathbf{R}^n/L_1$. This is a variant of the fact that on $\mathbf{R}^n$ a linear transformation $A$
distorts volumes by a factor of $|\det A|$, where $\det A$ denotes the determinant of
$A$.
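
As a small illustration of the volume rule, the sketch below computes the volume of the quotient by a lattice $A(\mathbf{Z}^n)$ as $|\det A|$, with the normalization that the quotient by $\mathbf{Z}^n$ has volume 1. The function names `det` and `covolume` and the sample matrix are illustrative choices, not notation from the text, and exact rational arithmetic is used only to keep the example free of rounding.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def covolume(basis_matrix):
    """Volume of R^n / A(Z^n), normalised so that R^n / Z^n has volume 1."""
    return abs(det(basis_matrix))


# Hypothetical example: the lattice in R^2 generated by the columns (2, 0) and (1, 3).
A = [[2, 1], [0, 3]]
identity = [[1, 0], [0, 1]]
print(covolume(identity), covolume(A))
```

Swapping the two generators changes the sign of the determinant but not the lattice, which is why the absolute value appears.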

Suppose that $L_1$, $L_2$ are lattices in $\mathbf{R}^n$, and that $T$ is an invertible linear
transformation on $\mathbf{R}^n$ such that $T(L_1) = L_2$. Recall that $T$ is an orthogonal
transformation on $\mathbf{R}^n$ if $T$ is invertible with inverse given by the adjoint, also
known as the transpose, of $T$, and that this is equivalent to saying that $T$
preserves the standard norm of vectors in $\mathbf{R}^n$, and the standard inner product
of vectors in $\mathbf{R}^n$. In other words, orthogonal transformations on $\mathbf{R}^n$ are linear
mappings which preserve the geometry in $\mathbf{R}^n$, and for the lattices $L_1$, $L_2$ and
the quotients of $\mathbf{R}^n$ by them we have that the induced mapping $\widehat{T}$ from $\mathbf{R}^n/L_1$
onto $\mathbf{R}^n/L_2$ preserves the geometry as well.

In short, quotients of $\mathbf{R}^n$ by lattices are all the same in terms of group
structure, topological and even smooth structure, and affine structure, but not in
general in terms of more precise geometry. The volume of the quotient space is one
basic parameter that one can consider. It is also interesting to look at closed
curves in the quotient which are locally flat, their lengths, the angles at which
they meet, and so on.

We can consider lattices in $\mathbf{C}^n$ as well. In this regard we can identify $\mathbf{C}^n$
with $\mathbf{R}^{2n}$ in the usual manner, so that the real and imaginary parts of the $n$
components of an element of $\mathbf{C}^n$ give rise to the $2n$ components of an element
of $\mathbf{R}^{2n}$, and then define a lattice in $\mathbf{C}^n$ to be a lattice in $\mathbf{R}^{2n} \cong \mathbf{C}^n$. We write
$\mathbf{Z}[i]$ for the Gaussian integers, which are complex numbers of the form $a + i b$,
where $a$, $b$ are integers, and $(\mathbf{Z}[i])^n$ for the lattice in $\mathbf{C}^n$ consisting of $n$-tuples
of Gaussian integers, which is also called the standard integer lattice in $\mathbf{C}^n$.

If $L$ is a lattice in $\mathbf{C}^n$, then the quotient $\mathbf{C}^n/L$ inherits a complex structure
from $\mathbf{C}^n$. This means in particular that the tangent spaces of the quotient are
complex vector spaces, just as they are for $\mathbf{C}^n$. If $L_1$, $L_2$ are lattices in $\mathbf{C}^n$ and
$A$ is an invertible complex-linear transformation on $\mathbf{C}^n$ such that $A(L_1) = L_2$,
then $A$ induces a mapping $\widehat{A}$ from $\mathbf{C}^n/L_1$ to $\mathbf{C}^n/L_2$ which preserves this complex
structure.

We can combine the complex and Riemannian structures and consider
Hermitian structures. Basically this means looking at correspondences between
lattices in $\mathbf{C}^n$ which come from unitary mappings on $\mathbf{C}^n$. If $L_1$, $L_2$ are
lattices in $\mathbf{C}^n$ and $T$ is a unitary mapping on $\mathbf{C}^n$ such that $T(L_1) = L_2$, then
the induced mapping $\widehat{T}$ from $\mathbf{C}^n/L_1$ to $\mathbf{C}^n/L_2$ preserves both the complex and
Riemannian structures.

Let us focus on complex structures for a moment. It will be convenient to
write $\mathcal{L}(\mathbf{R}^m, \mathbf{C}^n)$ for the space of real-linear mappings from $\mathbf{R}^m$ to $\mathbf{C}^n$. The
complex structure on $\mathbf{C}^n$ is still relevant for this space, in that $\mathcal{L}(\mathbf{R}^m, \mathbf{C}^n)$ is
naturally a complex vector space, because one can multiply elements of $\mathcal{L}(\mathbf{R}^m, \mathbf{C}^n)$
by $i$. One can also describe these linear transformations by $m \times n$ matrices of
complex numbers in the usual manner, using the standard bases for $\mathbf{R}^m$ and
$\mathbf{C}^n$.

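Let us write $\mathcal{L}^*(\mathbf{R}^m, \mathbf{C}^n)$ for the subset of $\mathcal{L}(\mathbf{R}^m, \mathbf{C}^n)$ consisting of linear
transformations whose kernels are trivial, at least when $m \le 2n$, so that this
is possible. Using the usual Euclidean topology for $\mathcal{L}(\mathbf{R}^m, \mathbf{C}^n)$, $\mathcal{L}^*(\mathbf{R}^m, \mathbf{C}^n)$ is
an open set. When $m = 2n$, $\mathcal{L}^*(\mathbf{R}^m, \mathbf{C}^n)$ consists of the invertible real-linear
transformations from $\mathbf{R}^m$ onto $\mathbf{C}^n$, and a lattice in $\mathbf{C}^n$ is the image of $\mathbf{Z}^{2n}$
under an element of $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$.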

Now let us look at general lattices in $\mathbf{C}^n$, under the equivalence relation in
which two lattices $L_1$, $L_2$ are considered to be equivalent if there is an invertible
complex-linear transformation $A$ on $\mathbf{C}^n$ such that $A(L_1) = L_2$. This leads to
an equivalence relation on $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$, in which two elements of $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$
are considered to be equivalent if one can be written as the composition of
an invertible complex-linear transformation on $\mathbf{C}^n$ with the other element of
$\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$. In other words, we look at the action of $GL(\mathbf{C}^n)$ on $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$
by post-composition.

Actually, it is more convenient to consider $\mathcal{L}^*_1(\mathbf{R}^{2n}, \mathbf{C}^n)$, which we define to
be the subset of $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$ consisting of invertible real-linear transformations
from $\mathbf{R}^{2n}$ to $\mathbf{C}^n$ such that the images of the first $n$ standard basis vectors in $\mathbf{R}^{2n}$
are linearly independent over the complex numbers as $n$ vectors in $\mathbf{C}^n$. This
restriction is not too serious, and indeed we can describe the lattices in $\mathbf{C}^n$ as
images of $\mathbf{Z}^{2n}$ under mappings in $\mathcal{L}^*_1(\mathbf{R}^{2n}, \mathbf{C}^n)$. In other words, if we start with
a lattice $L$ given as the image of $\mathbf{Z}^{2n}$ under an element of $\mathcal{L}^*(\mathbf{R}^{2n}, \mathbf{C}^n)$, we can
rewrite it as the image of $\mathbf{Z}^{2n}$ under a linear transformation in $\mathcal{L}^*_1(\mathbf{R}^{2n}, \mathbf{C}^n)$ by
pre-composing the initial linear transformation from $\mathbf{R}^{2n}$ to $\mathbf{C}^n$ with an
invertible linear transformation on $\mathbf{R}^{2n}$ which permutes the standard basis vectors in
a suitable way.

To deal with the action of $GL(\mathbf{C}^n)$ by post-composition, we can restrict
ourselves to $\mathcal{L}^{**}(\mathbf{R}^{2n}, \mathbf{C}^n)$, which we define to be the space of invertible
real-linear transformations from $\mathbf{R}^{2n}$ to $\mathbf{C}^n$ such that the images of the first $n$
standard basis vectors in $\mathbf{R}^{2n}$ are the $n$ standard basis vectors in $\mathbf{C}^n$, and in
the same order. In other words, if we identify $\mathbf{R}^{2n}$ with the Cartesian product
$\mathbf{R}^n \times \mathbf{R}^n$, then these are the invertible real-linear transformations from $\mathbf{R}^n \times \mathbf{R}^n$
onto $\mathbf{C}^n$ with the property that on $\mathbf{R}^n \times \{0\}$ they coincide with the standard
embedding of $\mathbf{R}^n$ into $\mathbf{C}^n$. This exactly compensates for the action of $GL(\mathbf{C}^n)$
by post-composition, since for any collection $v_1, \ldots, v_n$ of linearly independent
vectors in $\mathbf{C}^n$ there is a unique $A \in GL(\mathbf{C}^n)$ such that $A(v_1), \ldots, A(v_n)$ are the
standard basis vectors in $\mathbf{C}^n$, in order.

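We can identify $\mathcal{L}^{**}(\mathbf{R}^{2n}, \mathbf{C}^n)$ with an open subset of $\mathcal{L}(\mathbf{R}^n, \mathbf{C}^n)$. That is,
elements of $\mathcal{L}^{**}(\mathbf{R}^{2n}, \mathbf{C}^n)$ can be identified with linear transformations from
$\mathbf{R}^n \times \mathbf{R}^n$ into $\mathbf{C}^n$, and these linear transformations are determined by what
they do on $\{0\} \times \mathbf{R}^n$, since their behavior on $\mathbf{R}^n \times \{0\}$ is fixed by definition.
We can think of elements of $\mathcal{L}(\mathbf{R}^n, \mathbf{C}^n)$ as being written as $A + i B$, where $A$,
$B$ are linear transformations on $\mathbf{R}^n$, and one can check that the elements of
$\mathcal{L}^{**}(\mathbf{R}^{2n}, \mathbf{C}^n)$ correspond exactly to elements of $\mathcal{L}(\mathbf{R}^n, \mathbf{C}^n)$ of the form $A + i B$,
where $A$, $B$ are linear transformations on $\mathbf{R}^n$ and $B$ is invertible.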

To be more precise, it is helpful to think in terms of real-linear mappings on
$\mathbf{C}^n$, which can be written as

(3.7) $T(x + i y) = E_1(x) + E_2(y) + i\,(E_3(x) + E_4(y))$,

where $x, y \in \mathbf{R}^n$. The passage to $\mathcal{L}^{**}(\mathbf{R}^{2n}, \mathbf{C}^n)$ can be expressed in these terms
as the restriction to invertible real-linear transformations $T$ on $\mathbf{C}^n$ of the form

(3.8) $T(x + i y) = x + A(y) + i\,B(y)$,

where $A$, $B$ are linear transformations on $\mathbf{R}^n$. The condition of invertibility of
$T$ is equivalent to the invertibility of $B$ on $\mathbf{R}^n$.
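
One way to see why invertibility of $T$ in (3.8) comes down to $B$ alone is to write $T$ as a real block matrix. The sketch below assumes the usual identification of $x + i y$ with the pair $(x, y)$, and the block notation is ours rather than the text's.

```latex
% T(x + i y) = x + A(y) + i B(y) has real part x + A y and imaginary part B y,
% so in terms of the pair (x, y) it acts by a block upper-triangular matrix:
\[
  \begin{pmatrix} x \\ y \end{pmatrix}
  \longmapsto
  \begin{pmatrix} I & A \\ 0 & B \end{pmatrix}
  \begin{pmatrix} x \\ y \end{pmatrix},
  \qquad
  \det \begin{pmatrix} I & A \\ 0 & B \end{pmatrix} = \det B .
\]
% Hence T is invertible exactly when det B is nonzero, whatever A is.
```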

Another way to look at real-linear mappings on $\mathbf{C}^n$ is as mappings of the
form

(3.9) $T(z) = M(z) + N(\overline{z})$,

where $z \in \mathbf{C}^n$, $M$ and $N$ are complex-linear mappings on $\mathbf{C}^n$, and for $w \in \mathbf{C}^n$,
$\overline{w}$ is the element of $\mathbf{C}^n$ whose coordinates are the complex conjugates of the
coordinates of $w$.

Invertibility of $T$ is a bit tricky, and as an important special case, it is natural
to restrict our attention to mappings $T$ as above for which $M$ majorizes $N$ in
the sense that

(3.10) $|N(z)| < |M(z)|$

for $z \in \mathbf{C}^n$, $z \ne 0$, where $|w|$ denotes the standard Euclidean norm of $w \in \mathbf{C}^n$.
To factor out the action of $GL(\mathbf{C}^n)$ by post-composition, we can restrict our
attention to real-linear transformations $T$ of the form

(3.11) $T(z) = z + E(\overline{z})$,

where $E$ is a complex-linear transformation on $\mathbf{C}^n$ with operator norm strictly
less than 1, which is equivalent to saying that $E^* E < I$. This has nice features
when we think of the image of the standard integer lattice $(\mathbf{Z}[i])^n$ under $T$, with
points in the image being reasonably close to their counterparts in the original
lattice.

The $n = 1$ case is quite instructive. We can write a real-linear transformation
$T$ on $\mathbf{C}$ as

(3.12) $T(x + i y) = a x + i b y$

for $x, y \in \mathbf{R}$, where $a$, $b$ are complex numbers, and when $T$ is invertible we can
rewrite this as

(3.13) $T(x + i y) = a\,(x + i c y)$,

where $a$, $c$ are complex numbers with $a \ne 0$ and $c$ having nonzero imaginary
part. Alternatively, we can write a real-linear transformation $T$ on $\mathbf{C}$ as $T(z) =
\alpha z + \beta \overline{z}$ with $\alpha, \beta \in \mathbf{C}$, where $T$ is invertible if and only if $|\alpha| \ne |\beta|$, and
when $|\alpha| > |\beta|$ this can be rewritten as

(3.14) $T(z) = \theta\,(z + \mu \overline{z})$,

where $\theta$ is a nonzero complex number and $\mu$ is a complex number such that
$|\mu| < 1$.
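
The second description of the $n = 1$ case is easy to experiment with. The sketch below is only an illustration: it represents a real-linear map on $\mathbf{C}$ by a pair of coefficients for $z$ and its conjugate, uses the fact that the determinant of the corresponding real $2 \times 2$ matrix is the difference of the squared moduli of the two coefficients, and produces the normalized form (3.14). The function names are hypothetical and not taken from the text.

```python
def real_linear_map(alpha, beta):
    """Return the map z -> alpha*z + beta*conj(z) on the complex numbers."""
    return lambda z: alpha * z + beta * complex(z).conjugate()


def real_determinant(alpha, beta):
    """Determinant of the map viewed as a real 2x2 matrix: |alpha|^2 - |beta|^2."""
    return abs(alpha) ** 2 - abs(beta) ** 2


def is_invertible(alpha, beta):
    """The map is invertible exactly when the two moduli differ."""
    return abs(alpha) != abs(beta)


def normalized_form(alpha, beta):
    """Return (theta, mu) with T(z) = theta*(z + mu*conj(z)) and |mu| < 1.

    Only meaningful when |alpha| > |beta|, which is checked here.
    """
    if abs(alpha) <= abs(beta):
        raise ValueError("normalized form requires |alpha| > |beta|")
    return alpha, beta / alpha


# Hypothetical sample coefficients.
alpha, beta = 2 + 1j, 0.5j
T = real_linear_map(alpha, beta)
theta, mu = normalized_form(alpha, beta)
z = 1.5 - 2j
print(T(z), theta * (z + mu * z.conjugate()), abs(mu))
```

When the first coefficient has the smaller modulus the map is still invertible, but it reverses orientation and the roles of $z$ and its conjugate are exchanged, which is why the normalized form is restricted to one case.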

Let us return now to the real case. Consider the quotient space $O(\mathbf{R}^n) \backslash GL(\mathbf{R}^n)$,
in which two invertible linear transformations on $\mathbf{R}^n$ are identified if one can be
written as an orthogonal linear transformation times the other. We can identify
this quotient space with the space of symmetric linear transformations on $\mathbf{R}^n$
which are positive definite, through the mapping

(3.15) $T \mapsto T^* T$.

In other words, if $T$ is an invertible linear transformation on $\mathbf{R}^n$, then $T^* T$ is a
symmetric linear transformation on $\mathbf{R}^n$ which is positive-definite, $T_1^* T_1 = T_2^* T_2$
for $T_1, T_2 \in GL(\mathbf{R}^n)$ if and only if $T_2 = R\,T_1$ for some orthogonal transformation
$R$, and every symmetric linear transformation on $\mathbf{R}^n$ which is positive-definite
can be expressed as $T^* T$ for an invertible linear transformation $T$.
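
A quick way to see one direction of this identification is to compute with a specific orthogonal matrix. The sketch below uses a rotation by 90 degrees so that all entries stay integers; the helper names `transpose`, `matmul` and `gram` and the sample matrices are illustrative assumptions, and only the easy direction (multiplying on the left by an orthogonal matrix does not change $T^* T$) is checked.

```python
def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gram(t):
    """The symmetric matrix T* T attached to T (T* is the transpose here)."""
    return matmul(transpose(t), t)


T1 = [[2, 1], [0, 3]]        # an invertible matrix
R = [[0, -1], [1, 0]]        # rotation by 90 degrees, which is orthogonal
T2 = matmul(R, T1)           # T2 = R T1 lies in the same coset as T1

print(gram(T1), gram(T2))
```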

Similarly, the quotient $SO(\mathbf{R}^n) \backslash SL(\mathbf{R}^n)$ can be identified with the space
$\mathcal{M}(\mathbf{R}^n)$ of symmetric linear transformations on $\mathbf{R}^n$ which are positive definite
and have determinant equal to 1. Let us write $\Sigma(\mathbf{R}^n)$ for the elements of $SL(\mathbf{R}^n)$
whose matrices with respect to the standard basis have integer entries. The
inverse of a linear transformation in $\Sigma(\mathbf{R}^n)$ also lies in $\Sigma(\mathbf{R}^n)$, because Cramer's
rule gives a formula for the matrix of the inverse which shows that it has integer
entries when the original matrix has integer entries and determinant equal to 1.
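
The remark about Cramer's rule can be made concrete: when the determinant is 1, the inverse is the adjugate matrix, whose entries are signed minors and hence integers. The following sketch uses cofactor expansion, which is only practical for small matrices, and the function names and the sample matrix are illustrative choices rather than anything from the text.

```python
def minor(m, i, j):
    """Matrix obtained from m by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det(m):
    """Determinant by cofactor expansion along the first row (integers stay integers)."""
    if len(m) == 0:
        return 1  # determinant of the empty matrix
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def integer_inverse(m):
    """Inverse of an integer matrix with determinant 1, via the adjugate."""
    if det(m) != 1:
        raise ValueError("expected determinant equal to 1")
    n = len(m)
    # Entry (i, j) of the adjugate is the (j, i) cofactor of m.
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Hypothetical example with integer entries and determinant 1.
M = [[1, 2, 3], [0, 1, 4], [0, 0, 1]]
M_inv = integer_inverse(M)
print(M_inv)
```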

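Elements of $\Sigma(\mathbf{R}^n)$ can be described as the invertible linear transformations
with determinant 1 which take $\mathbf{Z}^n$ onto itself. The quotient $SL(\mathbf{R}^n)/\Sigma(\mathbf{R}^n)$ describes the space of
lattices $L$ in $\mathbf{R}^n$ such that the corresponding quotient $\mathbf{R}^n/L$ has volume equal
to 1 and for which there is an extra piece of data concerning orientation, and
the double quotient $SO(\mathbf{R}^n) \backslash SL(\mathbf{R}^n) / \Sigma(\mathbf{R}^n)$ deals with these lattices up to
equivalence under rotation. By identifying $SO(\mathbf{R}^n) \backslash SL(\mathbf{R}^n)$ with $\mathcal{M}(\mathbf{R}^n)$, the
double quotient can be identified with the quotient of $\mathcal{M}(\mathbf{R}^n)$ by the action
of $\Sigma(\mathbf{R}^n)$ defined by $A \mapsto T^* A T$, $A \in \mathcal{M}(\mathbf{R}^n)$, $T \in \Sigma(\mathbf{R}^n)$. This quotient is
denoted $\mathcal{M}(\mathbf{R}^n)/\Sigma(\mathbf{R}^n)$.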

In the complex case let us consider lattices $L$ in $\mathbf{C}^n$ which are of the form
$A((\mathbf{Z}[i])^n)$ for some invertible complex-linear mapping $A$ on $\mathbf{C}^n$. It is natural to
look at these lattices up to unitary equivalence, which is to say that two lattices
$L_1$, $L_2$ are equivalent if there is a unitary linear transformation $T$ on $\mathbf{C}^n$ such
that $T(L_1) = L_2$. This leads to an equivalence relation on $GL(\mathbf{C}^n)$, in which two
invertible linear transformations $A_1$, $A_2$ on $\mathbf{C}^n$ are considered to be equivalent
if there is a unitary linear transformation $T$ on $\mathbf{C}^n$ such that $A_2 = T A_1$. The
quotient of $GL(\mathbf{C}^n)$ by this equivalence relation is denoted

(3.16) $U(\mathbf{C}^n) \backslash GL(\mathbf{C}^n)$

and can be identified with the space of self-adjoint linear transformations on $\mathbf{C}^n$
which are positive definite, through the mapping

(3.17) $A \in GL(\mathbf{C}^n) \mapsto A^* A$.

That is, for each element $A$ of $GL(\mathbf{C}^n)$, the product $A^* A$ is a self-adjoint
linear transformation on $\mathbf{C}^n$ which is positive-definite, $A_1^* A_1 = A_2^* A_2$ for two
elements $A_1$, $A_2$ of $GL(\mathbf{C}^n)$ if and only if $A_2 = T A_1$ for some unitary linear
transformation $T$ on $\mathbf{C}^n$, and every positive-definite self-adjoint linear transformation on $\mathbf{C}^n$ can
be expressed as $A^* A$ for some invertible linear transformation $A$ on $\mathbf{C}^n$.

Similarly, one can consider two elements $B_1$, $B_2$ of $SL(\mathbf{C}^n)$ to be equivalent
when there is a linear transformation $U$ in the special unitary group $SU(\mathbf{C}^n)$
such that $B_2 = U B_1$. The quotient $SU(\mathbf{C}^n) \backslash SL(\mathbf{C}^n)$ can be identified with the
space of self-adjoint linear transformations on $\mathbf{C}^n$ which are positive-definite
and have determinant 1, through the same mapping as before. Let us consider
lattices $L$ of the form $B((\mathbf{Z}[i])^n)$ for some $B \in SL(\mathbf{C}^n)$, a modest normalization.

As in the real case we write $\Sigma(\mathbf{C}^n)$ for the subgroup of $SL(\mathbf{C}^n)$ of linear
transformations whose associated $n \times n$ matrices, with respect to the standard
basis for $\mathbf{C}^n$, have integer entries, which implies that the matrices associated
to their inverses also have integer entries. Thus $B((\mathbf{Z}[i])^n) = (\mathbf{Z}[i])^n$ when
$B \in \Sigma(\mathbf{C}^n)$, and conversely $B \in SL(\mathbf{C}^n)$ and $B((\mathbf{Z}[i])^n) = (\mathbf{Z}[i])^n$ implies
that $B \in \Sigma(\mathbf{C}^n)$. The quotient $SL(\mathbf{C}^n)/\Sigma(\mathbf{C}^n)$ represents the space of lattices
under consideration, the double quotient $SU(\mathbf{C}^n) \backslash SL(\mathbf{C}^n) / \Sigma(\mathbf{C}^n)$ represents
the space of these lattices modulo equivalence under special unitary
transformations, and this double quotient can be identified with the quotient of the
space $\mathcal{M}(\mathbf{C}^n)$ of self-adjoint positive-definite linear transformations on $\mathbf{C}^n$ with
determinant 1 by the action of $\Sigma(\mathbf{C}^n)$ defined by $P \mapsto B^* P B$, $B \in \Sigma(\mathbf{C}^n)$.

Next we consider real and complex projective spaces. Namely, if $n$ is a
positive integer, then the $n$-dimensional real and complex projective spaces $\mathbf{RP}^n$,
$\mathbf{CP}^n$ consist of the real and complex lines through the origin in $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$,
respectively. To put it another way, if $\mathbf{R}_*$, $\mathbf{C}_*$ denote the nonzero real and complex
numbers, respectively, then we have natural actions of $\mathbf{R}_*$, $\mathbf{C}_*$ on $\mathbf{R}^{n+1} \setminus \{0\}$,
$\mathbf{C}^{n+1} \setminus \{0\}$ by scalar multiplication, and the projective spaces are the
corresponding quotient spaces. Thus two nonzero vectors $v$, $w$ in $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$ lead to the
same point in the corresponding projective space exactly when they are scalar
multiples of each other. Note that we get canonical mappings from $\mathbf{R}^{n+1} \setminus \{0\}$,
$\mathbf{C}^{n+1} \setminus \{0\}$ onto $\mathbf{RP}^n$, $\mathbf{CP}^n$, in which a nonzero vector $v$ is sent to the line
through the origin which passes through $v$.

If $L$ is a nontrivial linear subspace of $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$ of dimension $l + 1$, say,
then we get an interesting space $P(L)$ consisting of all lines through the origin
in $L$, which we can think of as sitting inside of $\mathbf{RP}^n$, $\mathbf{CP}^n$, as appropriate. More
precisely, $P(L)$ is basically a copy of $\mathbf{RP}^l$ or $\mathbf{CP}^l$. These are the $l$-dimensional
"linear subspaces" of projective space, analogous to linear subspaces of $\mathbf{R}^n$, $\mathbf{C}^n$.

If $A$ is an invertible linear transformation on $\mathbf{R}^{n+1}$ or on $\mathbf{C}^{n+1}$, then $A$
takes lines to lines, and induces a transformation $\widehat{A}$ on the corresponding
projective space. Notice that $\widehat{A}$ is automatically a one-to-one transformation of the
corresponding projective space onto itself, with

(3.18) $\widehat{A}^{-1} = \widehat{(A^{-1})}$,

and $\widehat{A}$ maps linear subspaces of projective space to linear subspaces of projective
space, in the sense of the preceding paragraph. Also, if $A_1$, $A_2$ are invertible linear
transformations on $\mathbf{R}^{n+1}$ or on $\mathbf{C}^{n+1}$, then the induced transformations $\widehat{A}_1$, $\widehat{A}_2$ on the
corresponding projective space satisfy

(3.19) $\widehat{A}_1 \circ \widehat{A}_2 = \widehat{(A_1 \circ A_2)}$.

Let $H$ be a hyperplane in $\mathbf{R}^{n+1}$ or in $\mathbf{C}^{n+1}$, which is to say a linear subspace
of dimension $n$, and let $v$ be a vector in $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$, as appropriate, which
does not lie in $H$. This leads to an affine hyperplane $H + v$, consisting of all vectors of the form
$w + v$, $w \in H$, and which does not contain the vector 0. For each $w \in H$, we
can look at the line through $w + v$, which we can view as an element of the
corresponding projective space.

In other words, we basically get an embedding of $H$ into the corresponding
projective space, $\mathbf{RP}^n$ or $\mathbf{CP}^n$. Of course we can also think of $H$ as being
isomorphic to $\mathbf{R}^n$ or $\mathbf{C}^n$, so that we are really looking at a bunch of embeddings
of $\mathbf{R}^n$, $\mathbf{C}^n$ into $\mathbf{RP}^n$, $\mathbf{CP}^n$, respectively. For instance, we can do this with $H$
equal to the $j$th coordinate hyperplane in $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$, $1 \le j \le n + 1$, which
is defined by the condition that the $j$th coordinates of vectors in $H$ are equal to
0, and we can take $v$ to be the $j$th standard basis vector, with $j$th coordinate
equal to 1 and the other $n$ coordinates equal to 0.

These $n+1$ embeddings of $\mathbf{R}^n$, $\mathbf{C}^n$ into $\mathbf{RP}^n$, $\mathbf{CP}^n$ corresponding to the $n+1$
coordinate hyperplanes in $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$ are sufficient to cover the projective
space, i.e., every point in projective space shows up in the image of at least one
of the embeddings. For a given hyperplane $H$, the set of points in the projective
space which do not occur in the embedding of $H$ is the same as $P(H)$. Thus
the set of missing points in the projective space is a projective subspace of
dimension 1 less.
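
The coordinate charts just described can be sketched in a few lines. In this illustration a point of projective space is represented by any nonzero vector of homogeneous coordinates, the chart index `j` is 0-based (unlike the 1-based index in the text), and rational coordinates are used so that comparisons are exact; the function names `chart` and `embed` are hypothetical.

```python
from fractions import Fraction


def chart(point, j):
    """Coordinates of the line through `point` in the j-th chart (0-based).

    Returns None when the j-th homogeneous coordinate vanishes, i.e. when the
    line lies in the coordinate hyperplane H and so belongs to P(H).
    """
    if point[j] == 0:
        return None
    # Rescale so that the j-th coordinate equals 1, then drop it.
    scaled = [Fraction(x) / Fraction(point[j]) for x in point]
    return tuple(scaled[:j] + scaled[j + 1:])


def embed(coords, j):
    """Inverse of the chart: insert a 1 as the j-th homogeneous coordinate."""
    coords = list(coords)
    return tuple(coords[:j] + [1] + coords[j:])


p = (2, 4, 6)
print(chart(p, 0), chart((0, 1, 5), 0), chart((0, 1, 5), 1))
```

Every nonzero vector has at least one nonzero coordinate, so at least one of the charts is defined at each point, which is the covering statement in the text.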

Using these embeddings of $\mathbf{R}^n$, $\mathbf{C}^n$ into the corresponding projective spaces,
we can think of the projective spaces as being manifolds. That is, these
embeddings provide local coordinates for all points in the projective space. Two
different embeddings which contain the same point $p$ in their image are
compatible in terms of topology and also smooth structure, and in the complex case we can
say that $\mathbf{CP}^n$ is a complex manifold. In the real and complex situations there is
a finer "projective" structure which is reflected in the presence of nice projective
subspaces, for instance, and the projectivized versions of linear transformations
on $\mathbf{R}^{n+1}$, $\mathbf{C}^{n+1}$.

Note that two invertible linear transformations $A_1$, $A_2$ on $\mathbf{R}^{n+1}$ or on $\mathbf{C}^{n+1}$
lead to the same induced transformation on projective space if and only if there
is a nonzero scalar $a$ such that $A_2 = a A_1$. Thus the group of these "projective
linear transformations" has dimension $(n + 1)^2 - 1$ over the real or complex
numbers, as appropriate. Also, for any pair of points $p$, $q$ in a projective space,
there is a projective linear transformation which takes $p$ to $q$.

If $k$, $n$ are positive integers with $k < n$, then the Grassmann spaces $G_{\mathbf{R}}(k, n)$,
$G_{\mathbf{C}}(k, n)$ consist of the $k$-dimensional linear subspaces of $\mathbf{R}^n$, $\mathbf{C}^n$, respectively.
When $k = 1$ this reduces to the $(n - 1)$-dimensional projective spaces. Suppose
that $L$, $M$ are linear subspaces of $\mathbf{R}^n$ or of $\mathbf{C}^n$ which are complementary, in the
sense that

(3.20) $L \cap M = \{0\}$ and $L + M = \mathbf{R}^n$ or $\mathbf{C}^n$,

as appropriate, and that $L$ has dimension $k$, so that $M$ has dimension $n - k$.
If $A$ is a linear mapping from $L$ to $M$, then the graph of $A$, consisting of the
vectors

(3.21) $v + A(v)$, $v \in L$,

is also a $k$-dimensional subspace of $\mathbf{R}^n$ or of $\mathbf{C}^n$, as appropriate. In this way
we can embed the vector space of linear transformations from $L$ to $M$ into the
Grassmannian, and this provides a nice coordinate patch around $L$ itself.
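
The simplest instance of such a coordinate patch may help to fix ideas. The choice of $n = 2$, $k = 1$, the basis vectors $e_1$, $e_2$, and the parameter $a$ below are ours and serve only as an illustration.

```latex
% Take L = span{e_1} and M = span{e_2} in R^2, and let A : L -> M be given by
% A(s e_1) = a s e_2 for a real number a.  The graph of A is
\[
  \{ v + A(v) : v \in L \} = \{ (s, a s) : s \in \mathbf{R} \},
\]
% the line through the origin with slope a.  As a ranges over R, this patch
% covers every line through the origin in R^2 except M itself (the vertical
% line), and a = 0 corresponds to L.
```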

In particular, these coordinate patches permit one to view the Grassmann
spaces as smooth manifolds, and as complex manifolds in the complex case. The
dimension of $G_{\mathbf{R}}(k, n)$, $G_{\mathbf{C}}(k, n)$ is equal to

(3.22) $k (n - k)$

with respect to the real or complex numbers, as appropriate. Just as for
projective spaces, invertible linear transformations on $\mathbf{R}^n$ or on $\mathbf{C}^n$ induce interesting
mappings on the corresponding Grassmannians. These actions are again
transitive, because if $L_1$, $L_2$ are $k$-dimensional linear subspaces of $\mathbf{R}^n$ or of $\mathbf{C}^n$, then
there is an invertible linear transformation $A$ on $\mathbf{R}^n$ or on $\mathbf{C}^n$, as appropriate,
such that $A(L_1) = L_2$.

There is a natural correspondence between the Grassmann spaces of
$k$-dimensional linear subspaces in $\mathbf{R}^n$ or in $\mathbf{C}^n$ and the Grassmann spaces of
$(n - k)$-dimensional linear subspaces of $\mathbf{R}^n$ or of $\mathbf{C}^n$, respectively. More
precisely, there is a natural correspondence between $k$-dimensional linear subspaces
of $\mathbf{R}^n$ or of $\mathbf{C}^n$ and $(n - k)$-dimensional linear subspaces of the dual spaces
associated to $\mathbf{R}^n$ or $\mathbf{C}^n$, consisting of the linear functionals on $\mathbf{R}^n$ or on $\mathbf{C}^n$.
Namely, if $L$ is a $k$-dimensional linear subspace of $\mathbf{R}^n$ or of $\mathbf{C}^n$, then one gets an
$(n - k)$-dimensional linear subspace of the dual space by taking the linear
functionals which vanish on $L$. Conversely, if one starts with an $(n - k)$-dimensional
subspace of the dual space, one gets a $k$-dimensional subspace of the original
space by taking the intersection of the kernels of the linear functionals in the
subspace of the dual space. Using the standard basis for $\mathbf{R}^n$ or $\mathbf{C}^n$, or any
other basis for that matter, one can identify $\mathbf{R}^n$, $\mathbf{C}^n$ with their dual spaces in
a well-known manner.

Suppose that $L$ is a linear subspace of $\mathbf{R}^n$ or of $\mathbf{C}^n$ of
dimension $l$. If $l > k$, then we get an interesting subspace of the Grassmann space of
$k$-dimensional linear subspaces of $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate,
consisting of the $k$-dimensional linear subspaces contained in $L$. If $k > l$, then
there is another interesting subspace of the Grassmann space, consisting of the
$k$-dimensional linear subspaces of $\mathbf{R}^n$ or $\mathbf{C}^n$ which contain $L$.
When $k = l$, these two cases are the same and we get simply a point in the Grassmann
space.

Let $n$ be a positive integer. By a multiindex we mean an $n$-tuple
$\alpha = (\alpha_1, \ldots, \alpha_n)$ of nonnegative integers, and in this case we set

(3.23)    $|\alpha| = \sum_{j=1}^n |\alpha_j|$.

Given a multiindex $\alpha$, we define the corresponding monomial $w^\alpha$ on
$\mathbf{R}^n$ or on $\mathbf{C}^n$ by

(3.24)    $w^\alpha = w_1^{\alpha_1} \cdots w_n^{\alpha_n}$,

where $w = (w_1, \ldots, w_n)$ as usual, and we call $|\alpha|$ the degree of this
monomial. When $\alpha_j = 0$ we interpret $w_j^{\alpha_j}$ as being equal to the
constant 1.

A polynomial $p(w)$ on $\mathbf{R}^n$ or on $\mathbf{C}^n$ is a function which is a
linear combination of monomials. We take the coefficients to be real or complex numbers
according to whether we are working on $\mathbf{R}^n$ or on $\mathbf{C}^n$. A
polynomial $p(w)$ which is a linear combination of monomials of the same degree $a$ is
said to be homogeneous of degree $a$. This is equivalent to the condition that

(3.25)    $p(\lambda w) = \lambda^a \, p(w)$

for all real or complex numbers $\lambda$ and all $w$ in $\mathbf{R}^n$ or in
$\mathbf{C}^n$, as appropriate.
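
The homogeneity condition (3.25) can be checked numerically on a small example. The sketch below is only an illustration: the dictionary representation of a polynomial, the helper names `monomial`, `evaluate`, `degree`, and the particular polynomial are all assumptions made here, not part of the text.

```python
def monomial(w, alpha):
    """Evaluate w^alpha = w_1^alpha_1 * ... * w_n^alpha_n."""
    value = 1
    for w_j, a_j in zip(w, alpha):
        value *= w_j ** a_j  # Python gives w_j ** 0 == 1, matching the convention
    return value

def degree(alpha):
    """|alpha| for a multiindex of nonnegative integers."""
    return sum(alpha)

def evaluate(poly, w):
    """poly maps multiindices (tuples) to coefficients."""
    return sum(c * monomial(w, alpha) for alpha, c in poly.items())

# Hypothetical example on R^3: p(w) = 3 w1^2 w2 - 5 w2^3 + w1 w2 w3,
# a linear combination of monomials of degree 3.
p = {(2, 1, 0): 3, (0, 3, 0): -5, (1, 1, 1): 1}

w = (2, -1, 3)
lam = 4
scaled = tuple(lam * t for t in w)
print(evaluate(p, scaled), lam ** 3 * evaluate(p, w))  # the two values agree
```

Integer inputs are used so that both sides of (3.25) are compared exactly rather than up to rounding.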

Suppose that $p_1(w), \ldots, p_{n+1}(w)$ are polynomials on $\mathbf{R}^{n+1}$ or on
$\mathbf{C}^{n+1}$ which are homogeneous of the same degree $a > 0$. Assume also that

(3.26)    $p_1(w) = \cdots = p_{n+1}(w) = 0$

only when $w = 0$. The combined mapping $p(w) = (p_1(w), \ldots, p_{n+1}(w))$ is a
homogeneous polynomial mapping of $\mathbf{R}^{n+1}$ or $\mathbf{C}^{n+1}$ into itself
which maps nonzero vectors to nonzero vectors, and as a result induces a mapping
$\hat{p}$ from $\mathbf{RP}^n$ or $\mathbf{CP}^n$ to itself, as appropriate. The degree 1
case corresponds exactly to invertible linear mappings on $\mathbf{R}^{n+1}$ or on
$\mathbf{C}^{n+1}$ and the associated projective linear transformations on the
corresponding projective spaces. If $p = (p_1, \ldots, p_{n+1})$,
$q = (q_1, \ldots, q_{n+1})$ are homogeneous polynomial mappings on $\mathbf{R}^{n+1}$
or on $\mathbf{C}^{n+1}$ of this type, of degrees $a, b > 0$, respectively, then the
composition $p \circ q$ is a homogeneous polynomial mapping of degree $a b$, and
$\widehat{p \circ q} = \hat{p} \circ \hat{q}$.

When $n = 1$, we can think of $\mathbf{RP}^1$, $\mathbf{CP}^1$ as being the same as
$\mathbf{R}$, $\mathbf{C}$ with an additional point added, often denoted $\infty$.
Projective linear transformations on $\mathbf{RP}^1$, $\mathbf{CP}^1$ can then be
described as mappings of the form $(a z + b)/(c z + d)$, where $a d - b c \ne 0$, and
using standard conventions along the lines of $1/0 = \infty$, $1/\infty = 0$. Similarly,
the mappings associated to homogeneous polynomials as in the preceding paragraph reduce
to nonconstant rational functions of a single variable.
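
The conventions for the added point can be made concrete in a short sketch. The code below is an illustration under assumptions made here: the sentinel `INF`, the helper names, and the use of exact rational arithmetic are not from the text. It applies a fractional linear mapping on the line with one point added, and checks on sample points that composing two such mappings corresponds to multiplying their coefficient matrices.

```python
from fractions import Fraction

INF = object()  # stands for the added point "infinity" (a sentinel chosen for this sketch)

def mobius(m, z):
    """Apply z -> (a z + b)/(c z + d) on the line with one point added."""
    (a, b), (c, d) = m
    if a * d - b * c == 0:
        raise ValueError("need ad - bc != 0")
    if z is INF:
        # (a z + b)/(c z + d) tends to a/c as z grows, and a/0 is read as infinity
        return INF if c == 0 else Fraction(a) / Fraction(c)
    num, den = a * z + b, c * z + d
    # den == 0 forces num != 0 because ad - bc != 0, so the value is infinity
    return INF if den == 0 else Fraction(num) / Fraction(den)

def matmul(m1, m2):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a, b), (c, d) = m1
    (e, f), (g, h) = m2
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def same_point(u, v):
    """Equality on the extended line, treating INF as a single point."""
    return (u is INF and v is INF) or (u is not INF and v is not INF and u == v)

m1 = ((1, 2), (3, 4))
m2 = ((0, 1), (1, 0))  # the mapping z -> 1/z
samples = [Fraction(0), Fraction(1), Fraction(-4, 3), Fraction(-3, 4), INF]
print(all(same_point(mobius(matmul(m1, m2), z), mobius(m1, mobius(m2, z)))
          for z in samples))
```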

4 Immersions, submersions, and connections 

Let $M$, $N$ be nonempty smooth manifolds of dimensions $m$, $n$, respectively, so that
$M$, $N$ look locally like $\mathbf{R}^m$, $\mathbf{R}^n$ in a sense, and they have
countable bases for their topologies. As basic situations, $M$, $N$ might in fact be
open subsets of $\mathbf{R}^m$, $\mathbf{R}^n$, respectively. They might instead be
given as embedded submanifolds of higher-dimensional Euclidean spaces.

Suppose that $f$ is a smooth mapping from $M$ into $N$. For each point $p \in M$,
the differential of $f$ at $p$ is denoted $df_p$ and is a linear mapping from the
tangent space of $M$ at $p$, which is denoted $T_pM$ and is an $m$-dimensional real
vector space, into the tangent space $T_{f(p)}N$ of $N$ at $f(p)$, an $n$-dimensional
real vector space. We say that $f$ is an immersion if $df_p$ is an injective linear
transformation from $T_pM$ into $T_{f(p)}N$ for each $p \in M$, which is to say that the
kernel of $df_p$ is trivial for all $p \in M$, and in which case $m \le n$. We say that
$f$ is a submersion if $df_p$ maps $T_pM$ onto $T_{f(p)}N$ for all $p \in M$, in which
case $m \ge n$. When $m = n$, these two conditions are equivalent to each other, and to
the statement that $df_p$ is a one-to-one linear mapping from $T_pM$ onto $T_{f(p)}N$
for each $p \in M$.

When $m \le n$, we have the standard embedding of $\mathbf{R}^m$ into $\mathbf{R}^n$,
in which a point $x = (x_1, \ldots, x_m)$ in $\mathbf{R}^m$ is sent to
$\tilde{x} = (\tilde{x}_1, \ldots, \tilde{x}_n)$ in $\mathbf{R}^n$, with
$\tilde{x}_j = x_j$ for $1 \le j \le m$ and $\tilde{x}_i = 0$ for $m < i \le n$. When
$m \ge n$, we have the standard projection from $\mathbf{R}^m$ onto $\mathbf{R}^n$, in
which one keeps the first $n$ coordinates of a point in $\mathbf{R}^m$ and drops the
remaining $m - n$ coordinates. By the implicit function theorem, immersions and
submersions are locally equivalent to these standard models, with the appropriate
dimensions. When $m = n$, immersions and submersions are the same, and they are local
diffeomorphisms, as in the inverse function theorem.

If $f : M \to N$ is a submersion, then $f$ is an open mapping in particular,
which is to say that $f(U)$ is an open subset of $N$ whenever $U$ is an open subset
of $M$. If we also assume that $f$ is proper, in the sense that $f^{-1}(K)$ is a compact
subset of $M$ whenever $K$ is a compact subset of $N$, then it is easy to check
that $f(E)$ is a closed subset of $N$ when $E$ is a closed subset of $M$. This implies
in turn that $f(M) = N$ when $N$ is connected, since $f(M)$ would then be a
nonempty subset of $N$ which is both open and closed.

Let us consider some examples. Fix a positive integer $n$, and for $M$ take
$\mathbf{R}^{n+1} \setminus \{0\}$. For $N$ we can take the projective space
$\mathbf{RP}^n$, and we have a natural mapping from $\mathbf{R}^{n+1} \setminus \{0\}$
to $\mathbf{RP}^n$ which sends a nonzero vector $v$ in $\mathbf{R}^{n+1}$ to the point
in $\mathbf{RP}^n$ corresponding to the line through $v$. This mapping is smooth
and defines a submersion.

We can also restrict this mapping to the unit sphere $S^n$ in $\mathbf{R}^{n+1}$,
consisting of the vectors $v$ such that $|v| = 1$. This mapping is still a smooth
mapping from $S^n$ onto $\mathbf{RP}^n$, with the dimensions of the domain and range now
being the same. This mapping is a local diffeomorphism, and of course two vectors $v$,
$w$ in $S^n$ are sent to the same point in $\mathbf{RP}^n$ if and only if $v = w$ or
$v = -w$.

Now take $M$ to be $\mathbf{C}^{n+1} \setminus \{0\}$, which has real dimension
$2n + 2$, and let $N$ be $\mathbf{CP}^n$, which has real dimension $2n$. Once again
there is a natural mapping which sends a nonzero vector $v$ in $\mathbf{C}^{n+1}$ to the
point in complex projective space $\mathbf{CP}^n$ that corresponds to the line through
$v$. This mapping is smooth, and in fact holomorphic, and it is also a submersion. We
can also restrict this mapping to the sphere $S^{2n+1}$ consisting of the vectors $v$ in
$\mathbf{C}^{n+1}$ such that $|v| = 1$, to get a submersion onto $\mathbf{CP}^n$. Two
vectors $v$, $w$ in $S^{2n+1}$ are sent to the same point in $\mathbf{CP}^n$ by this
mapping if and only if $w = a v$ for some complex number $a$ such that $|a| = 1$, so
that the fibers of this submersion from $S^{2n+1}$ onto $\mathbf{CP}^n$ are all circles.

Suppose that $M$, $N$ are smooth manifolds of dimensions $m$, $n$, respectively,
and that $f : M \to N$ is a proper smooth submersion, so that the fibers $f^{-1}(z)$,
$z \in N$, are compact submanifolds of $M$ of dimension $m - n$. Fix a point
$z_1$ in $N$, and suppose that $V_1$ is a neighborhood of $z_1$ in $N$ and that $\phi$
is a smooth mapping from $f^{-1}(V_1) \subseteq M$ into $f^{-1}(z_1)$ such that
$\phi(x) = x$ when $f(x) = z_1$. We can combine $f$, $\phi$ to get a smooth mapping
$(f, \phi)$ from $f^{-1}(V_1)$ into $V_1 \times f^{-1}(z_1)$. The differential of this
combined mapping is invertible at each element of the fiber $f^{-1}(z_1)$, and it
follows that there is a neighborhood $V_2$ of $z_1$ contained in $V_1$ such that the
combined mapping $(f, \phi)$ defines a diffeomorphism from $f^{-1}(V_2)$ onto
$V_2 \times f^{-1}(z_1)$. In particular, if $z_2$ is an element of $N$ which is
sufficiently close to $z_1$, then the fibers $f^{-1}(z_1)$, $f^{-1}(z_2)$ are
diffeomorphic smooth manifolds of dimension $m - n$. If $N$ is connected, then it
follows that all of the fibers $f^{-1}(z)$, $z \in N$, are diffeomorphic to each other.

Again let $M$, $N$ be smooth manifolds of dimensions $m$, $n$, respectively, and
let $f : M \to N$ be a smooth submersion which may or may not be proper, at
least for the moment. For each $p \in M$, we get a linear subspace $V_p$ of the tangent
space $T_pM$ consisting of the tangent vectors to $M$ at $p$ which are also tangent
to the fiber $f^{-1}(f(p))$ of $f$ passing through $p$. This can also be described as
the kernel of the differential $df_p$ of $f$ at $p$, as a linear mapping from $T_pM$ to
$T_{f(p)}N$. Of course $V_p$ has dimension $m - n$ for each $p \in M$, because $f$ is a
submersion.

By a connection on $M$ with respect to the submersion $f : M \to N$ we mean a
choice of an $n$-dimensional linear subspace $H_p$ of the tangent space $T_pM$ for each
point $p \in M$ which is transverse to the vertical subspace $V_p$ and which depends
smoothly on $p$. We call $H_p$ the horizontal linear subspace of the tangent space
$T_pM$ determined by the connection, and the restriction of the differential $df_p$ of
$f$ at $p$ to the horizontal subspace $H_p$ of $T_pM$ is a one-to-one linear mapping of
$H_p$ onto the tangent space $T_{f(p)}N$ of $N$ at $f(p)$. To put it another way, at
each point $p \in M$ the tangent space $T_pM$ of $M$ at $p$ is the direct sum of the
horizontal and vertical subspaces $H_p$ and $V_p$. The vertical subspace $V_p$ is
determined by $f$, and there is some room for making choices for the horizontal
subspaces.

As a basic scenario, suppose that $M$ is also equipped with a smooth Riemannian
metric, which is to say an inner product on each tangent space $T_pM$ which depends
smoothly on $p$. In this case, one can simply choose $H_p$ to be the orthogonal
complement of $V_p$ in $T_pM$ with respect to the Riemannian metric. This shows that
connections always exist, since Riemannian metrics exist on any smooth manifold. Recall
that one way to choose a Riemannian metric on a smooth manifold $M$ is to choose local
Riemannian metrics in coordinate charts and combine them using a partition of unity,
and another way is to embed $M$ into a Euclidean space and then use the Riemannian
metric inherited from the one on the Euclidean space.

Let $A$ be an invertible linear transformation on $\mathbf{R}^n$ or on $\mathbf{C}^n$
such that

(4.1)    $\lim_{j \to \infty} A^j(v) = 0$

for all $v$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate. A sufficient condition
for this to hold is that the norm of $A$ be strictly less than 1. In the complex case,
this condition holds if and only if the eigenvalues of $A$ all have modulus strictly
less than 1. In the real case one can complexify $\mathbf{R}^n$ to convert $A$ to a
linear transformation on $\mathbf{C}^n$, and the condition holds on $\mathbf{R}^n$ if
and only if it holds for the complexification on $\mathbf{C}^n$, which is to say that
the modulus of each of the eigenvalues of the associated linear transformation on
$\mathbf{C}^n$ should be strictly less than 1. Notice that our condition is also
equivalent to

(4.2)    $\lim_{j \to \infty} |A^{-j}(v)| = \infty$

for all nonzero vectors $v$ in $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate.
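
The gap between the norm condition and the eigenvalue condition can be seen on a small example. The following sketch uses a $2 \times 2$ matrix chosen here for illustration (it is not from the text): both eigenvalues equal $1/2$, yet the matrix stretches one vector, so its norm exceeds 1, and the iterates still tend to 0. The helper names are assumptions of this sketch.

```python
import math

def apply(A, v):
    """Multiply the 2 x 2 matrix A (nested lists) by the vector v."""
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]

def norm(v):
    """Euclidean norm of a vector in the plane."""
    return math.hypot(v[0], v[1])

def iterate(A, v, j):
    """Return A^j(v) for a nonnegative integer j."""
    for _ in range(j):
        v = apply(A, v)
    return v

# Assumed example: upper triangular with both eigenvalues equal to 1/2,
# but |A(v)| > |v| for v = (0, 1), so the operator norm of A is larger than 1.
A = [[0.5, 2.0], [0.0, 0.5]]
v = [0.0, 1.0]
print(norm(apply(A, v)))        # larger than |v| = 1
print(norm(iterate(A, v, 60)))  # very close to 0, as in (4.1)
```

So a norm strictly less than 1 is sufficient for (4.1) but not necessary; the eigenvalue condition is the one that characterizes it.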

Let us define a space $\mathcal{H}_A$ by starting with $\mathbf{R}^n \setminus \{0\}$ or
$\mathbf{C}^n \setminus \{0\}$, as appropriate, and identifying two nonzero vectors $v$,
$w$ when

(4.3)    $w = A^j(v)$

for some integer $j$. Thus $\mathcal{H}_A$ is a compact real or complex manifold,
according to whether one starts with $\mathbf{R}^n$ or $\mathbf{C}^n$. There is a
natural smooth mapping from $\mathbf{R}^n \setminus \{0\}$ or
$\mathbf{C}^n \setminus \{0\}$ onto $\mathcal{H}_A$, as appropriate, in which a nonzero
vector $v$ is mapped to the corresponding equivalence class in $\mathcal{H}_A$, and this
mapping is holomorphic in the complex case. Of course $\mathcal{H}_A$ has the same
dimension as $\mathbf{R}^n \setminus \{0\}$ or $\mathbf{C}^n \setminus \{0\}$, as
appropriate, and the mapping to $\mathcal{H}_A$ is a local diffeomorphism.

If $n = 1$, then $A(v) = \alpha v$ for some nonzero scalar $\alpha$ with
$|\alpha| < 1$. In the real case, if $\alpha > 0$, then $A$ sends the positive real
numbers to the positive real numbers and $A$ sends the negative real numbers to the
negative real numbers, and $\mathcal{H}_A$ is basically a disjoint union of two circles.
If $\alpha < 0$, then $A$ maps the positive real numbers to the negative real numbers
and vice-versa, and $\mathcal{H}_A$ reduces to a single circle. In the complex case, we
can think of $\mathbf{C} \setminus \{0\}$ as the same as $\mathbf{C}/2 \pi i \mathbf{Z}$,
i.e., the quotient of $\mathbf{C}$ by translations by integer multiples of $2 \pi i$,
using the exponential mapping from $\mathbf{C}$ onto $\mathbf{C} \setminus \{0\}$. If
$\beta$ is a complex number such that $\exp \beta = \alpha$ and $L$ is the lattice in
$\mathbf{C}$ consisting of complex numbers of the form $2 \pi m i + n \beta$,
$m, n \in \mathbf{Z}$, then $\mathcal{H}_A$ can be identified with $\mathbf{C}/L$,
which is to say that we get a 1-dimensional complex torus.

When $n \ge 2$, consider the special case where $A$ is of the form $A(v) = \alpha v$ for
some nonzero scalar $\alpha$ such that $|\alpha| < 1$. In this case there is a natural
smooth mapping from $\mathcal{H}_A$ onto $\mathbf{RP}^{n-1}$ or $\mathbf{CP}^{n-1}$, as
appropriate, in which elements of $\mathcal{H}_A$ are sent to the lines in
$\mathbf{R}^n$ or $\mathbf{C}^n$ that contain the corresponding vectors. This mapping
is a submersion, and it is holomorphic in the complex case. The fibers of this mapping
are copies of what one gets in the 1-dimensional case. Also, linear mappings on
$\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, induce interesting mappings on
$\mathcal{H}_A$.

In general dimensions and for general $A$, suppose that $L$ is a nontrivial linear
subspace of $\mathbf{R}^n$ or $\mathbf{C}^n$, as appropriate, such that

(4.4)    $A(L) = L$.

We can apply the same construction to get a space analogous to $\mathcal{H}_A$ for the
restriction of $A$ to $L$, and this space can be viewed as a submanifold of
$\mathcal{H}_A$. In particular, a 1-dimensional invariant subspace for $A$ is the same
as the span of a nonzero eigenvector of $A$, and this leads to a submanifold of
$\mathcal{H}_A$ which is a copy of what one gets in the 1-dimensional case. Also, a
linear transformation on $\mathbf{R}^n$ or on $\mathbf{C}^n$ which commutes with $A$
leads to an interesting mapping on $\mathcal{H}_A$.

Suppose again that $M$, $N$ are smooth manifolds of dimensions $m$, $n$, that
$f : M \to N$ is a submersion, and that we have a connection on $M$ associated to
this submersion, defined by a smooth family $H_p$ of horizontal linear subspaces
of $T_pM$, $p \in M$. Fix a point $p \in M$, and let $\alpha(t)$ be a smooth curve in
$N$ defined on an interval $[a, b]$ in the real line such that

(4.5)    $f(p) = \alpha(a)$.

Consider the question of having a smooth curve $\beta(t)$, $a \le t \le b$, in $M$ such
that

(4.6)    $\beta(a) = p$

and

(4.7)    $f(\beta(t)) = \alpha(t)$ for all $a \le t \le b$,

so that $\beta(t)$ is a lifting of $\alpha(t)$ which starts at $p$, and also

(4.8)    $\dot{\beta}(t) \in H_{\beta(t)}$ for all $a \le t \le b$.

Here $\dot{\beta}(t)$ denotes the derivative of $\beta(t)$ at $t$, which is
automatically an element of $T_{\beta(t)}M$, and which satisfies

(4.9)    $df_{\beta(t)}(\dot{\beta}(t)) = \dot{\alpha}(t)$

for all $t$. This condition and the requirement that the derivative of $\beta(t)$ belong
to the horizontal subspaces specified by the connection determine the derivative
of $\beta(t)$ in terms of $\beta(t)$ and the derivative of $\alpha(t)$.

In other words, $\beta(t)$ satisfies an ordinary differential equation. More
precisely, in local coordinates one can convert this into a system of ordinary
differential equations in $\mathbf{R}^m$ of the usual type. Standard results about
ordinary differential equations imply that $\beta(t)$ is uniquely determined by
$\alpha(t)$ and the starting point $p$, when such a lifting exists. Also, such a lifting
always exists at least on a shorter interval beginning at $a$. If the submersion
$f : M \to N$ is proper, then the lifting $\beta(t)$ exists for the whole interval
$[a, b]$, and there are other conditions like this for lifting the whole curve as well,
basically by ensuring that the lifted curve remain in a compact subset of $M$.
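
To see the differential equation for a horizontal lifting in the simplest possible setting, here is a numerical sketch under assumptions made for illustration only: $M = \mathbf{R}^2$, $N = \mathbf{R}$, $f(x, y) = x$, and horizontal subspaces spanned by $(1, s(x, y))$ for a given slope function $s$. The function name `horizontal_lift`, its parameters, and the use of a classical Runge-Kutta step are choices of this sketch, not of the text.

```python
import math

def horizontal_lift(alpha, dalpha, slope, a, b, y0, steps=1000):
    """Lift the curve alpha in N = R to M = R^2 for f(x, y) = x.

    The horizontal subspace at (x, y) is assumed to be spanned by
    (1, slope(x, y)), so a horizontal lift (alpha(t), y(t)) satisfies
    y'(t) = slope(alpha(t), y(t)) * alpha'(t).  Integrated with classical RK4.
    """
    def rhs(t, y):
        return slope(alpha(t), y) * dalpha(t)

    dt = (b - a) / steps
    t, y = a, y0
    for _ in range(steps):
        k1 = rhs(t, y)
        k2 = rhs(t + dt / 2, y + dt / 2 * k1)
        k3 = rhs(t + dt / 2, y + dt / 2 * k2)
        k4 = rhs(t + dt, y + dt * k3)
        y += dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += dt
    return (alpha(b), y)  # endpoint of the lifted curve

# Assumed data: alpha(t) = t on [0, 1] and horizontal subspaces span{(1, y)}.
end = horizontal_lift(lambda t: t, lambda t: 1.0, lambda x, y: y, 0.0, 1.0, 1.0)
print(end)  # second coordinate is close to e, since y' = y and y(0) = 1
```

In this example the lifting exists on the whole interval even though $f$ is not proper, because the equation is linear; the induced mapping from the fiber over 0 to the fiber over 1 is multiplication by $e$.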

Standard results about ordinary differential equations also imply smoothness
results about mappings associated to liftings like these. Namely, one can vary
the choice of $p$ in the fiber $f^{-1}(\alpha(a))$, and get smooth dependence on $p$ of
the lifting. One can always do this locally, for $t$ near $a$. If $f : M \to N$ is
proper, so that we have liftings on the whole interval $[a, b]$, then the lifting of
paths defines a mapping from the fiber $f^{-1}(\alpha(a))$ to the fiber
$f^{-1}(\alpha(b))$, and this mapping is a bijection with the inverse mapping obtained
by running the lifting backwards along the interval $[a, b]$. Smooth dependence on $p$
implies that this mapping from the fiber $f^{-1}(\alpha(a))$ onto $f^{-1}(\alpha(b))$ is
in fact a diffeomorphism.

If $f : M \to N$ is proper and $N$ is connected, then we get another way to
see that the fibers of $f$ are all diffeomorphic to each other. Namely, any pair of
points in $N$ can be connected by a smooth curve if $N$ is connected. One can use
liftings of this curve to get a diffeomorphism between the corresponding fibers,
as above. Of course there are plenty of variations of these themes.

Let us consider another example. Let $U$ denote the upper half-space in
$\mathbf{C}$, which consists of complex numbers with positive imaginary part. We start
basically with the Cartesian product $\mathbf{C} \times U$, and the coordinate
projection of this space onto $U$. This projection is obviously a holomorphic mapping
and a submersion.

For each $\alpha \in U$, let $L_\alpha$ be the lattice in $\mathbf{C}$ consisting of
$m + n \alpha$, $m, n \in \mathbf{Z}$. We can think of this as a fixed lattice in
$\mathbf{C}$, or as a family of lattices in a family of copies of $\mathbf{C}$. For each
$\alpha \in U$, let us write $\mathcal{E}(\alpha)$ for the 1-dimensional complex torus
$\mathbf{C}/L_\alpha$. Let us write $\mathcal{E}$ for the space that we get by
identifying $(z, \alpha)$ and $(w, \alpha)$ in $\mathbf{C} \times U$ when
$w - z \in L_\alpha$.

Thus $\mathcal{E}$ is a complex manifold of dimension 2, and we get a holomorphic
projection mapping from $\mathbf{C} \times U$ onto $\mathcal{E}$ sending a given pair
$(z, \alpha)$ in $\mathbf{C} \times U$ to the corresponding equivalence class in
$\mathcal{E}$. Locally $\mathcal{E}$ looks like $\mathbf{C} \times U$, which is to say
that the natural quotient mapping is locally a biholomorphism. We also have a natural
mapping from $\mathcal{E}$ onto $U$, which is to say that the standard coordinate
projection from $\mathbf{C} \times U$ pushes down to $\mathcal{E}$ in a natural way.
That is, the standard coordinate projection from $\mathbf{C} \times U$ is the same as
the natural projection from $\mathbf{C} \times U$ onto $\mathcal{E}$ followed by the
mapping from $\mathcal{E}$ to $U$. This mapping from $\mathcal{E}$ to $U$ is a proper
holomorphic submersion.

For each $\alpha \in U$, we can identify the fiber in $\mathcal{E}$ over $\alpha$ from
our mapping $\mathcal{E} \to U$ with $\mathcal{E}(\alpha)$ in a simple way. As real
manifolds, these fibers are all diffeomorphic to each other. However, these fibers are
not equivalent in general as complex manifolds. In fact, for distinct nearby $\alpha$'s
the corresponding $\mathcal{E}(\alpha)$'s are not holomorphically equivalent. Locally
$\mathcal{E}$ is holomorphically equivalent to a product, and there is significant
activity more globally.

Now let $M$, $N$ be smooth manifolds with dimensions $m$, $n$, let $f : M \to N$ be
a smooth submersion, and let $H_p$, $p \in M$, be a smooth family of horizontal linear
subspaces of the tangent spaces of $M$ defining a connection for the submersion.
Thus each $H_p$ has dimension $n$ and the restriction of the differential $df_p$ of $f$
at $p$ to $H_p$ defines a one-to-one linear mapping onto $T_{f(p)}N$. We would like to
describe the curvature of this connection. Let us first review some aspects of
vector fields on a smooth manifold.

Let $U$ be a nonempty open subset of $M$. A smooth vector field $X$ on $U$
assigns to each $p \in U$ a tangent vector $X(p)$ to $M$ at $p$, in a way which is
smooth in $p$. Such a vector field defines a first-order linear differential operator
acting on smooth real-valued functions on $U$, which is to say that if $h$ is a
smooth real-valued function on $U$, then $X(h)$ at a point $p \in U$ is the directional
derivative of $h$ in the direction $X(p)$ at $p$. These differential operators are
linear, so that

(4.10)    $X(c_1 h_1 + c_2 h_2) = c_1 X(h_1) + c_2 X(h_2)$

for all real numbers $c_1$, $c_2$ and all smooth functions $h_1$, $h_2$, and they
satisfy the Leibniz rule

(4.11)    $X(h_1 h_2) = X(h_1) \, h_2 + h_1 \, X(h_2)$

for differentiating the product of two smooth functions $h_1$, $h_2$ on $U$.

If $X_1$, $X_2$ are two smooth vector fields on $U$, then one gets the associated
Lie bracket $[X_1, X_2]$ of $X_1$, $X_2$. In terms of differential operators, we have

(4.12)    $[X_1, X_2](h) = X_1(X_2(h)) - X_2(X_1(h))$

for all smooth functions $h$ on $U$. Of course $X_1(X_2(h))$, $X_2(X_1(h))$ involve
second derivatives of $h$, and these second derivatives cancel out in the difference,
leaving a first-order operator associated to a vector field. If $X_1$, $X_2$ are smooth
vector fields on $U$ and $\phi_1$, $\phi_2$ are smooth real-valued functions on $U$,
then $\phi_1 X_1$, $\phi_2 X_2$ also define smooth vector fields on $U$, and we have
that

(4.13)    $[\phi_1 X_1, \phi_2 X_2] = \phi_1 \phi_2 \, [X_1, X_2] + \phi_1 \, X_1(\phi_2) \, X_2 - \phi_2 \, X_2(\phi_1) \, X_1$.

Now let us return to the setting of our submersion and connection, and
assume that $X_1$, $X_2$ are smooth vector fields on a nonempty open subset $U$ of
$M$ such that $X_1(p)$, $X_2(p)$ are elements of the horizontal linear subspace $H_p$
of the tangent space $T_pM$ of $M$ at $p$ for each $p \in U$. Define the curvature
$C(X_1, X_2)$ at a point $p \in U$ to be the vertical component of $[X_1, X_2]$ at $p$,
which is to say that $C(X_1, X_2)$ at $p$ is an element of the vertical linear subspace
$V_p$ of the tangent space $T_pM$ of $M$ at $p$, and $[X_1, X_2] - C(X_1, X_2)$ lies in
the horizontal subspace $H_p$ at $p$. If $\phi_1$, $\phi_2$ are smooth real-valued
functions on $U$, then $\phi_1 X_1$, $\phi_2 X_2$ are also horizontal vector fields on
$U$, and

(4.14)    $C(\phi_1 X_1, \phi_2 X_2) = \phi_1 \phi_2 \, C(X_1, X_2)$,

because the last two terms on the right side of (4.13) are multiples of $X_2$, $X_1$
and hence horizontal. This shows that the curvature $C(X_1, X_2)$, at a point
$p \in U$, depends only on the values of $X_1$, $X_2$ at $p$, and thus at $p$ the
curvature $C(X_1, X_2)$ defines an antisymmetric bilinear mapping from $H_p \times H_p$
into $V_p$, which depends smoothly on $p$. Using the isomorphism between $H_p$ and
$T_{f(p)}N$ given by $df_p$, one can reformulate the curvature by saying that for each
$p \in M$ it is an antisymmetric bilinear mapping from $T_{f(p)}N \times T_{f(p)}N$ into
$V_p$ which depends smoothly on $p$.

What does it mean for the curvature to be equal to 0 everywhere on $M$?
This is equivalent to saying that if $X_1$, $X_2$ are smooth horizontal vector fields
on a nonempty open subset $U$ of $M$, then the Lie bracket $[X_1, X_2]$ is also a
horizontal vector field on $U$. In other words, the curvature of the connection is
equal to 0 on all of $M$ if and only if the corresponding distribution of horizontal
linear subspaces of the tangent spaces is integrable. By a well-known theorem of
Frobenius, this means that there is a foliation of $M$ by $n$-dimensional smooth
submanifolds whose tangent spaces are exactly the horizontal linear subspaces
of the tangent spaces of $M$ given by the connection.
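
A standard low-dimensional example may help to make the curvature concrete. It is supplied here as an illustration and is not taken from the text: the manifolds, the mapping $f$, and the vector fields $X_1$, $X_2$ below are all assumptions of this sketch.

```latex
% Assumed example: M = R^3, N = R^2, f(x, y, z) = (x, y), V_p = span(\partial_z).
% Horizontal subspaces: H_p = span(X_1, X_2) with
\[
  X_1 = \frac{\partial}{\partial x}, \qquad
  X_2 = \frac{\partial}{\partial y} + x \, \frac{\partial}{\partial z}.
\]
% For a smooth function h, the second derivatives cancel in X_1(X_2(h)) - X_2(X_1(h)):
\[
  [X_1, X_2](h)
  = \frac{\partial}{\partial x}\Bigl(\frac{\partial h}{\partial y}
      + x \, \frac{\partial h}{\partial z}\Bigr)
    - \Bigl(\frac{\partial}{\partial y} + x \, \frac{\partial}{\partial z}\Bigr)
      \frac{\partial h}{\partial x}
  = \frac{\partial h}{\partial z}.
\]
% Hence [X_1, X_2] = \partial_z is vertical, and C(X_1, X_2) = \partial_z \ne 0,
% so this horizontal distribution is not integrable.
% By contrast, H_p = span(\partial_x, \partial_y) has [\partial_x, \partial_y] = 0,
% zero curvature, and the planes z = const as the leaves of the foliation.
```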

5 Metric spaces 

By a metric space we mean a nonempty set $M$ together with a real-valued
function $d(x, y)$ defined for $x, y \in M$, called the distance function or metric on
$M$, such that $d(x, y) \ge 0$ for all $x, y \in M$, $d(x, y) = 0$ if and only if
$x = y$,

(5.1)    $d(y, x) = d(x, y)$

for all $x, y \in M$, and

(5.2)    $d(x, z) \le d(x, y) + d(y, z)$

for all $x, y, z \in M$. A basic example is given by the real numbers $\mathbf{R}$
equipped with the standard metric $|x - y|$. Recall that if $x$ is a real number, then
the absolute value of $x$ is denoted $|x|$ and defined to be equal to $x$ when
$x \ge 0$ and to $-x$ when $x \le 0$, and that

(5.3)    $|x + y| \le |x| + |y|$, $|x y| = |x| \, |y|$

for all $x, y \in \mathbf{R}$.

If $(M, d(x, y))$ and $(N, \rho(u, v))$ are metric spaces and $f : M \to N$ is a
mapping from $M$ to $N$, then $f$ is said to be continuous if for every $x \in M$
and every positive real number $\epsilon$ there is a positive real number $\delta$ such
that

(5.4)    $\rho(f(y), f(x)) < \epsilon$

for all $y \in M$ such that $d(y, x) < \delta$. For instance, constant mappings are
always continuous, and the identity mapping on a metric space $(M, d(x, y))$ is
continuous as a mapping from $M$ to itself. More generally, if $(M, d(x, y))$ is a
metric space and $E$ is a nonempty subset of $M$, then we can consider $E$ to be
a metric space itself, using the same metric $d(x, y)$ restricted to $E$, and then
the inclusion mapping of E into M, which sends each element of E to itself, is 
continuous as a mapping from $E$ to $M$.

Let $(M, d(x, y))$ be a metric space, and let $p$ be a point in $M$. From the
triangle inequality it is easy to see that

(5.5)    $d(x, p) - d(y, p) \le d(x, y)$

for all $x, y \in M$, and similarly

(5.6)    $d(y, p) - d(x, p) \le d(x, y)$,

so that

(5.7)    $|d(x, p) - d(y, p)| \le d(x, y)$

for all $x, y \in M$. It follows that the real-valued function $f_p(x) = d(x, p)$ on
$M$ is continuous, where, as usual, we employ the standard metric on the real
numbers.

If $f_1$, $f_2$ are two real-valued continuous functions on a metric space
$(M, d(x, y))$, then the sum $f_1 + f_2$ and the product $f_1 f_2$ are also continuous
functions. This is not too difficult to check. Similarly, if $f$ is a continuous
real-valued function on $M$ such that $f(x) \ne 0$ for all $x \in M$, then $1/f(x)$ is
also a continuous function on $M$.

Let $(M, d(x, y))$ be a metric space, and let $A$ be a nonempty subset of $M$.
We denote the distance from a point $x$ in $M$ to $A$ by $\mathrm{dist}(x, A)$, and we
define it to be the infimum of $d(x, y)$ over all $y \in A$. If $A$, $B$ are two
nonempty subsets of $M$, then the distance between $A$ and $B$ is denoted
$\mathrm{dist}(A, B)$ and defined to be the infimum of $d(x, y)$ over all $x \in A$ and
$y \in B$.

If $A$ is a nonempty subset of $M$ and $x$, $y$ are elements of $M$, then one can
check that

(5.8)    $\mathrm{dist}(x, A) \le \mathrm{dist}(y, A) + d(x, y)$,

and hence that

(5.9)    $|\mathrm{dist}(x, A) - \mathrm{dist}(y, A)| \le d(x, y)$

for all $x, y \in M$. In particular, $\mathrm{dist}(x, A)$ is a real-valued continuous
function of $x$ on $M$.

If $A$ and $B$ are nonempty subsets of $M$ and $t$ is a positive real number,
then we say that $A$, $B$ are $t$-close if for each $a \in A$ there is a $b \in B$ such
that $d(a, b) < t$, and if for each $b \in B$ there is an $a \in A$ such that
$d(a, b) < t$. If $A$, $B$ are nonempty subsets of $M$ which are $t$-close for some
positive real number $t$, then the Hausdorff distance from $A$ to $B$ is denoted
$D(A, B)$ and defined to be the infimum of the positive real numbers $t$ such that $A$,
$B$ are $t$-close. In this case, if $x$ is any point in $M$, then

(5.10)    $\mathrm{dist}(x, A) \le \mathrm{dist}(x, B) + D(A, B)$.

A subset $A$ of $M$ is said to be bounded if there is a point $p$ in $M$ and a
positive real number $R$ such that $d(x, p) \le R$ for all $x \in A$. It is easy to see
that once this holds for some $p \in M$, it works for all $p \in M$, with a choice of
$R$ that depends on $p$. The diameter of a nonempty bounded subset $A$ of $M$ is
denoted $\mathrm{diam}\, A$ and defined to be the supremum of $d(x, y)$ over all
$x, y \in A$.

If $A$, $B$ are nonempty bounded subsets of $M$, then clearly $A$, $B$ are $t$-close for
some positive real number $t$, and thus the Hausdorff distance $D(A, B)$ is defined.
Of course $B$, $A$ are $t$-close when $A$, $B$ are $t$-close, so that
$D(A, B) = D(B, A)$. Also, if $A$, $B$, $C$ are nonempty subsets of $M$ and $s$, $t$ are
positive real numbers such that $A$, $B$ are $s$-close and $B$, $C$ are $t$-close, then
$A$, $C$ are $(s + t)$-close, and

(5.11)    $D(A, C) \le D(A, B) + D(B, C)$.
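
For finite sets these quantities can be computed directly. The sketch below is an illustration with names chosen here (`dist_to_set`, `hausdorff`) and with sample sets on the real line that are not from the text. It relies on the observation that, for finite sets, the infimum of the $t$ for which $A$, $B$ are $t$-close is the largest of the distances from a point of one set to the other set.

```python
def dist_to_set(x, A, d):
    """dist(x, A): the infimum (here a minimum) of d(x, y) over y in A."""
    return min(d(x, y) for y in A)

def hausdorff(A, B, d):
    """D(A, B) for nonempty finite sets A, B.

    For finite sets the infimum of the t for which A, B are t-close equals
    the largest of the distances dist(a, B), a in A, and dist(b, A), b in B.
    """
    return max(max(dist_to_set(a, B, d) for a in A),
               max(dist_to_set(b, A, d) for b in B))

d = lambda x, y: abs(x - y)  # standard metric on the real line
A, B, C = [0, 1], [0, 3], [5]
print(hausdorff(A, B, d), hausdorff(B, C, d), hausdorff(A, C, d))
```

With these sample sets one can confirm the symmetry $D(A, B) = D(B, A)$, the triangle inequality (5.11), and the estimate (5.10) at a few points $x$.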

Suppose that $\phi(x)$ is a monotone increasing real-valued function on the real
line, so that $\phi(x) \le \phi(y)$ when $x$, $y$ are real numbers such that
$x \le y$. For each real number $x$, the left and right-sided limits of $\phi$ at $x$,
denoted $\phi(x-)$ and $\phi(x+)$, respectively, automatically exist and can be given by

(5.12)    $\phi(x-) = \sup\{\phi(w) : w < x\}$, $\phi(x+) = \inf\{\phi(y) : y > x\}$.

Clearly

(5.13)    $\phi(x-) \le \phi(x) \le \phi(x+)$,

and $\phi$ is continuous at $x$ if and only if

(5.14)    $\phi(x-) = \phi(x+)$.

For each positive real number $p$, one can show that the function $|x|^p$ is a
continuous real-valued function on $\mathbf{R}$. Fix a positive integer $n$, and for
each positive real number $p$ consider the real-valued function $\|x\|_p$ on
$\mathbf{R}^n$ defined by

(5.15)    $\|x\|_p = \Bigl(\sum_{j=1}^n |x_j|^p\Bigr)^{1/p}$,

$x = (x_1, \ldots, x_n)$. We can also allow $p = \infty$ here by setting

(5.16)    $\|x\|_\infty = \max(|x_1|, \ldots, |x_n|)$.

Thus $\|x\|_p$ is a nonnegative real number for each $x \in \mathbf{R}^n$ and
$0 < p \le \infty$ which is equal to 0 if and only if $x = 0$. We also have that

(5.17)    $\|t x\|_p = |t| \, \|x\|_p$

for each real number $t$, $x \in \mathbf{R}^n$, and $0 < p \le \infty$. Here $t x$
denotes the usual scalar multiplication of $t$ and the vector $x$, so that

(5.18)    $t x = (t x_1, \ldots, t x_n)$.

Clearly

(5.19)    $\|x\|_\infty \le \|x\|_p$

for all $x \in \mathbf{R}^n$ and $0 < p \le \infty$. More generally, if
$0 < p \le q \le \infty$, then

(5.20)    $\|x\|_q \le \|x\|_p$.

Indeed, when $q < \infty$,

(5.21)    $\|x\|_q^q = \sum_{j=1}^n |x_j|^q \le \|x\|_\infty^{q - p} \sum_{j=1}^n |x_j|^p$
              $= \|x\|_\infty^{q - p} \, \|x\|_p^p \le \|x\|_p^q$.

When $0 < p < 1$ we can take $q = 1$ and $n = 2$ to obtain that

(5.22)    $(a + b)^p \le a^p + b^p$

for all nonnegative real numbers $a$, $b$. For $p \ge 1$ a natural counterpart of this
is the fact that $t^p$ is a convex function of $t$ on the set of nonnegative real
numbers. In other words, if $t$, $u$ are nonnegative real numbers and $\lambda$ is a
real number such that $0 \le \lambda \le 1$, then

(5.23)    $(\lambda t + (1 - \lambda) u)^p \le \lambda t^p + (1 - \lambda) u^p$

for every real number $p \ge 1$.

It is easy to see that

(5.24)    $\|x\|_p \le n^{1/p} \, \|x\|_\infty$

for every $x \in \mathbf{R}^n$ and every positive real number $p$. In fact, if $p$, $q$
are positive real numbers such that $p \le q$, then

(5.25)    $\|x\|_p \le n^{(1/p) - (1/q)} \, \|x\|_q$

for all $x \in \mathbf{R}^n$. This can be derived from the convexity of the function
$t^{q/p}$ for $t \ge 0$.
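
These comparisons between the norms $\|x\|_p$ are easy to test numerically. The sketch below is an illustration only; the function name `p_norm`, the sample vector, and the list of exponents are choices made here. It prints, for consecutive exponents $p \le q$, whether $\|x\|_q \le \|x\|_p$ and $\|x\|_p \le n^{(1/p) - (1/q)} \|x\|_q$ hold, with $1/q$ read as 0 when $q = \infty$.

```python
import math

def p_norm(x, p):
    """||x||_p for 0 < p <= infinity, with p = math.inf giving the max norm."""
    if p == math.inf:
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = (3.0, -4.0, 12.0)
n = len(x)
exponents = [0.5, 1.0, 2.0, 3.0, math.inf]
for p, q in zip(exponents, exponents[1:]):
    smaller, larger = p_norm(x, q), p_norm(x, p)
    factor = n ** (1.0 / p - 1.0 / q)  # 1.0 / math.inf evaluates to 0.0
    # (5.20): ||x||_q <= ||x||_p; (5.24), (5.25): ||x||_p <= n^{1/p - 1/q} ||x||_q
    print(p, q, smaller <= larger, larger <= factor * smaller)
```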

For any $x, y \in \mathbf{R}^n$ and $1 \le p \le \infty$ we have that

(5.26)    $\|x + y\|_p \le \|x\|_p + \|y\|_p$.

This is easy to derive directly from the definitions when $p = 1$ or $\infty$. In
general, one can use homogeneity to reduce to showing that

(5.27)    $\|\lambda x + (1 - \lambda) y\|_p \le 1$

when $x, y \in \mathbf{R}^n$ satisfy $\|x\|_p, \|y\|_p \le 1$ and $\lambda$ is a real
number such that $0 \le \lambda \le 1$, and for $p > 1$ this can be derived from the
convexity of $t^p$, $t \ge 0$. As a result, when $1 \le p \le \infty$, we have that

(5.28)    $d_p(x, y) = \|x - y\|_p$

defines a metric on $\mathbf{R}^n$. When $0 < p \le 1$ we can set

(5.29)    $d_p(x, y) = \|x - y\|_p^p$,

and this also defines a metric on $\mathbf{R}^n$. To be more precise, this uses the
fact that

(5.30)    $\|v + w\|_p^p \le \|v\|_p^p + \|w\|_p^p$

when $v, w \in \mathbf{R}^n$ and $0 < p \le 1$.

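To see why the $p$-th power is needed when $p < 1$, one can compare the two candidates on a pair of basis vectors. The sketch below is an illustration with names (`power_sum`, `d_p`) and sample points chosen here: for $p = 1/2$ the quantity $\|x\|_p$ fails the triangle inequality, while $\|x - y\|_p^p$ satisfies it.

```python
def power_sum(x, p):
    """||x||_p^p, the sum of |x_j|^p."""
    return sum(abs(t) ** p for t in x)

def d_p(x, y, p):
    """d_p(x, y) = ||x - y||_p^p, intended for 0 < p <= 1."""
    return power_sum([s - t for s, t in zip(x, y)], p)

p = 0.5
x, y, origin = (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)
norm = lambda v: power_sum(v, p) ** (1.0 / p)

# ||x + y||_p = 4 exceeds ||x||_p + ||y||_p = 2, so ||.||_p itself fails
# the triangle inequality when p = 1/2 ...
print(norm((1.0, 1.0)), norm(x) + norm(y))
# ... while the p-th power satisfies it: d_p(x, y) <= d_p(x, 0) + d_p(0, y)
print(d_p(x, y, p), d_p(x, origin, p) + d_p(origin, y, p))
```
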
If $(M, d(x, y))$ and $(N, \rho(u, v))$ are metric spaces and $f : M \to N$ is a
mapping from $M$ into $N$, then we say that $f$ is Lipschitz if there is a nonnegative
real number $C$ such that

(5.31)    $\rho(f(x), f(y)) \le C \, d(x, y)$

for all $x, y \in M$. We might say that $f$ is $C$-Lipschitz in this case, and notice
that 0-Lipschitz mappings are constant. Of course Lipschitz mappings are
automatically continuous.

Suppose that $(M_1, d_1(x, y))$, $(M_2, d_2(u, v))$, and $(M_3, d_3(z, w))$ are metric
spaces, and that $f_1 : M_1 \to M_2$ and $f_2 : M_2 \to M_3$ are mappings between them.
As usual, the composition $f_2 \circ f_1$ is the mapping from $M_1$ to $M_3$ defined by

(5.32)    $(f_2 \circ f_1)(x) = f_2(f_1(x))$

for all $x \in M_1$. It is easy to see that if $f_1$ is a continuous mapping from $M_1$
to $M_2$ and $f_2$ is a continuous mapping from $M_2$ to $M_3$, then the composition
$f_2 \circ f_1$ is a continuous mapping from $M_1$ to $M_3$, and that if $f_1$, $f_2$
are Lipschitz, then so is the composition $f_2 \circ f_1$.

It is easy to generate examples of real-valued Lipschitz functions on the real
line, which can then be composed with some of the basic real-valued Lipschitz
functions on a metric space mentioned earlier to produce more Lipschitz functions.
In general, if $f_1$, $f_2$ are two real-valued Lipschitz functions on a metric
space $M$, then $f_1 + f_2$, $\min(f_1, f_2)$, and $\max(f_1, f_2)$ are also Lipschitz
functions on $M$, and a real number times a real-valued Lipschitz function is again a
Lipschitz function. For products of Lipschitz functions, or reciprocals of nonzero
Lipschitz functions, the situation is more complicated, although there are simple
sufficient conditions for the result to be Lipschitz.

Now let us look at continuous curves or paths in a metric space. Namely,
let $(M, d(x, y))$ be a metric space, and let $I$ be a closed interval in the real line.
That is, $I$ might be of the form

(5.33)    $[a, b] = \{x \in \mathbf{R} : a \le x \le b\}$

for some real numbers $a$, $b$ with $a < b$, in which case $I$ is a closed and bounded
interval, or $I$ might be an unbounded closed interval, of the form

(5.34)    $[a, \infty) = \{x \in \mathbf{R} : x \ge a\}$

for some real number $a$, or

(5.35)    $(-\infty, b] = \{x \in \mathbf{R} : x \le b\}$

for some real number $b$, or

(5.36)    $(-\infty, \infty) = \mathbf{R}$.

A continuous path in M parameterized by the interval I is simply a continuous
mapping from I into M. Sometimes we are particularly interested in
paths which are Lipschitz. This can be interpreted as meaning that the path
has bounded speed.

Suppose that I = [a, b] is a closed and bounded interval in the real line, and
that p : I \to M is a continuous path on I in the metric space (M, d(x, y)). By a
partition of I we mean a finite sequence \mathcal{P} = \{t_j\}_{j=0}^{k} of real
numbers such that

(5.37) a = t_0 < ... < t_k = b.

Associated to this partition \mathcal{P} we get an approximation to the length
of the path p, defined by

(5.38) \Lambda_a^b(p, \mathcal{P}) = \sum_{j=1}^{k} d(p(t_j), p(t_{j-1})).

If this quantity is uniformly bounded over all partitions \mathcal{P} of I, then
we say that the path p has finite length, and we define the length of p, denoted
\Lambda_a^b(p), to be the supremum of \Lambda_a^b(p, \mathcal{P}) over all
partitions \mathcal{P} of I. If p : I \to M is C-Lipschitz for some nonnegative
real number C, then p has finite length, and

(5.39) \Lambda_a^b(p) \le C (b - a).

It is sometimes convenient to allow a = b and k = 0 in the definition of a
partition, in which case the path automatically has length 0, and in general a
path has length 0 if and only if it is constant.
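The approximation (5.38) is easy to compute for a concrete path. The sketch
below is an illustration under assumptions made here: the path is the unit
half-circle p(t) = (cos t, sin t) on [0, pi] in the plane with the Euclidean
metric, which is 1-Lipschitz, and the helper names approx_length and
uniform_partition are hypothetical. It evaluates the sum over successively
refined uniform partitions.

```python
import math


def approx_length(p, dist, partition):
    """Sum of d(p(t_j), p(t_{j-1})) over consecutive partition points, as in (5.38)."""
    return sum(dist(p(t1), p(t0)) for t0, t1 in zip(partition, partition[1:]))


def uniform_partition(a, b, k):
    """Partition a = t_0 < ... < t_k = b of [a, b] into k equal pieces (k >= 1)."""
    return [a + (b - a) * j / k for j in range(k)] + [b]


def euclidean(u, v):
    """Standard Euclidean metric on the plane."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


def half_circle(t):
    """A 1-Lipschitz path in the plane: the unit half-circle."""
    return (math.cos(t), math.sin(t))


a, b = 0.0, math.pi
lengths = [
    approx_length(half_circle, euclidean, uniform_partition(a, b, k))
    for k in (1, 2, 4, 8, 16)
]

for k, value in zip((1, 2, 4, 8, 16), lengths):
    print(f"k = {k:2d}: approximate length {value:.6f}")
```

Each partition here refines the previous one, so by the triangle inequality
the sums do not decrease, and they stay below C (b - a) = pi, in agreement with
(5.39) for C = 1. A partition with a single point gives an empty sum, which
matches the convention that the length is 0 when a = b and k = 0.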

If (M, d(x, y)) is a metric space, p is an element of M, and r is a positive real
number, then the open ball in M with center p and radius r is denoted B(p, r)
and defined by

(5.40) B(p, r) = \{z \in M : d(p, z) < r\}.

Similarly, the closed ball with center p and radius r is denoted \overline{B}(p, r)
and defined by

(5.41) \overline{B}(p, r) = \{z \in M : d(p, z) \le r\}.

We make the convention that a "ball" means an open ball unless otherwise
specified.

A subset U of M is said to be open if for every point p \in U there is a positive
real number r such that

(5.42) B(p, r) \subseteq U.

The union of any family of open sets is open, and the intersection of finitely many 
open sets is open. Note that the empty set and M itself are automatically 
open subsets of M, and one can check that open balls in M are open subsets of 
M. 

A sequence \{x_j\}_{j=1}^\infty of points in M is said to converge to a point x
in M if for every \epsilon > 0 there is a positive integer L such that

(5.43) d(x_j, x) < \epsilon

for all j \ge L. In this case we write

(5.44) \lim_{j \to \infty} x_j = x,

and we call x the limit of the sequence \{x_j\}_{j=1}^\infty. It is not difficult
to see that the limit of a sequence is unique if it exists.

A subset F of M is said to be closed if every sequence of points in F which 
converges to some point in M has its limit in F. The empty set and M itself are 
automatically closed sets, and one can check that closed balls in M are closed 
subsets of M. The intersection of any family of closed sets is closed, and the 
union of finitely many closed sets is again closed. 

In fact, a subset U of M is open if and only if its complement M\U is closed.
Recall that the complement of a subset E of M in M is given by

(5.45) M\E = \{x \in M : x \notin E\}.

Equivalently, a subset F of M is closed if and only if M\F is an open subset of 
M. 

Suppose that M, N are sets and that f is a mapping from M to N. If A is
a subset of M, then the image of A under f is denoted f(A) and is the subset
of N defined by

(5.46) f(A) = \{f(x) : x \in A\}.

In particular, the image of f simply means f(M). If B is a subset of N, then the
inverse image of B under f is denoted f^{-1}(B) and is the subset of M defined
by

(5.47) f^{-1}(B) = \{x \in M : f(x) \in B\}.

The image of the union of a family of subsets of M under f is equal to the
union of the images of the individual subsets, and the image of the intersection
of a family of subsets of M under f is contained in the intersection of the images
of the individual subsets. The inverse image of the union of a family of subsets
of N under f is equal to the union of the inverse images of the individual subsets
of N, and the inverse image of the intersection of a family of subsets of N is
equal to the intersection of the corresponding individual inverse images. If B is
a subset of N, then

(5.48) M\f^{-1}(B) = f^{-1}(N\B).
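These set identities can be verified directly on finite sets. The following
Python sketch uses a hypothetical example chosen here, namely f(x) = x^2 from
M = {-3, ..., 3} to N = {0, ..., 9}, together with helper functions image and
preimage that are not from the text. It also shows that the containment for
images of intersections can be strict when f is not one-to-one.

```python
def image(f, A):
    """f(A) = {f(x) : x in A}, as in (5.46)."""
    return {f(x) for x in A}


def preimage(f, M, B):
    """f^{-1}(B) = {x in M : f(x) in B}, as in (5.47)."""
    return {x for x in M if f(x) in B}


# Hypothetical finite example: squaring maps M into N.
M = set(range(-3, 4))
N = set(range(0, 10))


def f(x):
    return x * x


A1, A2 = {-2, -1, 0}, {0, 1, 2}
B1, B2 = {0, 1, 4}, {4, 9}
B = {1, 4}

# Images commute with unions, but only a containment holds for intersections.
print(image(f, A1 | A2) == image(f, A1) | image(f, A2))
print(image(f, A1 & A2), "is contained in", image(f, A1) & image(f, A2))

# Inverse images commute with unions, intersections, and complements (5.48).
print(preimage(f, M, B1 | B2) == preimage(f, M, B1) | preimage(f, M, B2))
print(preimage(f, M, B1 & B2) == preimage(f, M, B1) & preimage(f, M, B2))
print(M - preimage(f, M, B) == preimage(f, M, N - B))
```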

If (M, d(x, y)) and (N, \rho(u, v)) are metric spaces, then a mapping f from M
to N is continuous if and only if f^{-1}(V) is an open subset of M for every open
subset V of N. This is also equivalent to saying that f^{-1}(E) is a closed subset
of M for every closed subset E of N. Moreover, f is continuous if and only if
for every sequence \{x_j\}_{j=1}^\infty of points in M which converges to a point
x in M we have that the sequence \{f(x_j)\}_{j=1}^\infty converges to f(x) in N.

A sequence \{x_j\}_{j=1}^\infty of points in a metric space (M, d(y, z)) is said
to be a Cauchy sequence if for every \epsilon > 0 there is a positive integer L
such that

(5.49) d(x_j, x_k) < \epsilon

for all j, k \ge L. Every convergent sequence is a Cauchy sequence, and a metric
space is said to be complete if every Cauchy sequence in the space converges to
some point in the space. A basic property of Euclidean spaces R^n with their
standard metrics \|x - y\|_2 is that they are complete.
If (M, d(x, y)) and (N, \rho(u, v)) are metric spaces and f is a mapping from
M to N, then f is said to be uniformly continuous if for every \epsilon > 0 there
is a \delta > 0 such that

(5.50) \rho(f(x), f(y)) < \epsilon

for all x, y \in M such that d(x, y) < \delta. It is easy to see that Lipschitz
mappings are uniformly continuous. Constant mappings are uniformly continuous
trivially, and the identity mapping on a metric space is 1-Lipschitz and hence
uniformly continuous.

If f : M \to N is uniformly continuous and \{x_j\}_{j=1}^\infty is a Cauchy
sequence in M, then \{f(x_j)\}_{j=1}^\infty is a Cauchy sequence in N. In
particular, if N is complete, then \{f(x_j)\}_{j=1}^\infty converges in N. Also,
if f : M \to N is uniformly continuous and \{x_j\}_{j=1}^\infty,
\{y_j\}_{j=1}^\infty are sequences in M such that

(5.51) \lim_{j \to \infty} d(x_j, y_j) = 0,

then

(5.52) \lim_{j \to \infty} \rho(f(x_j), f(y_j)) = 0

too.
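The implication from (5.51) to (5.52) also gives a practical test for showing
that a mapping is not uniformly continuous. As a sketch under assumptions made
here, take f(x) = x^2 on the real line with the standard metric, and the
hypothetical sequences x_j = j and y_j = j + 1/j. Then d(x_j, y_j) = 1/j tends
to 0, while |f(x_j) - f(y_j)| = 2 + 1/j^2 tends to 2, so f is continuous but
not uniformly continuous on R.

```python
def f(x):
    """A continuous function on the real line that is not uniformly continuous."""
    return x * x


indices = (1, 10, 100, 1000)
xs = [float(j) for j in indices]
ys = [j + 1.0 / j for j in indices]

# Distances in the domain, as in (5.51), and in the range, as in (5.52).
domain_gaps = [abs(x - y) for x, y in zip(xs, ys)]
range_gaps = [abs(f(x) - f(y)) for x, y in zip(xs, ys)]

for j, dg, rg in zip(indices, domain_gaps, range_gaps):
    print(f"j = {j:4d}: d(x_j, y_j) = {dg:.6f}, |f(x_j) - f(y_j)| = {rg:.6f}")
```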

In any metric space (M, d(x, y)), the closure of a subset E is denoted
\overline{E} and can be defined as the set of points x \in M for which there is a
sequence \{x_j\}_{j=1}^\infty of points in E which converges to x. Thus

(5.53) E \subseteq \overline{E}

automatically, and one can check that \overline{E} is a closed subset of M which
is contained in any other closed subset of M that contains E. A subset E of M is
said to be dense in M if \overline{E} = M.

Suppose that (M, d(x, y)) and (N, \rho(u, v)) are metric spaces, with N
complete, E is a dense subset of M, and f is a uniformly continuous mapping from
E to N. Under these conditions, one can show that there is a uniformly
continuous mapping from M to N which agrees with f on E. This extension is
unique, and for that matter if f_1, f_2 are two continuous mappings from one
metric space into another, then the set of points in the domain where f_1 and f_2
agree is a closed set, so that two continuous mappings which agree on a dense
set agree everywhere.

If \{x_j\}_{j=1}^\infty is a sequence of points in some set A, and if
\{j_k\}_{k=1}^\infty is a strictly increasing sequence of positive integers, then
\{x_{j_k}\}_{k=1}^\infty is called a subsequence of the original sequence
\{x_j\}_{j=1}^\infty. A subset K of a metric space (M, d(x, y)) is
said to be compact if every sequence of points in K has a subsequence which
converges to a point in K. Notice that if Y is a nonempty subset of M such
that K \subseteq Y, then K is compact as a subset of M if and only if K is compact
as a subset of Y, viewed as a metric space on its own, using the restriction of
the metric from M.

A compact subset of a metric space is always closed, basically because any 
subsequence of a convergent sequence converges to the same limit. Notice that 
a Cauchy sequence in a metric space which has a convergent subsequence also 
converges to the same limit. As a result, a Cauchy sequence contained in a
compact subset of a metric space converges to a point in that subset. 

A compact subset K of a metric space (M, d(x, y)) is bounded. To see this,
let p be a point in M, and assume for the sake of a contradiction that K is
not bounded, so that for each positive integer j there is a point x_j \in K with
d(p, x_j) > j. It is easy to see that the sequence \{x_j\}_{j=1}^\infty cannot
have a convergent subsequence in this case, contradicting the assumption that K
is compact.

A subset E of a metric space (M, d(x, y)) is said to be totally bounded if for
every positive real number r there is a finite set F \subseteq E such that

(5.54) E \subseteq \bigcup_{x \in F} B(x, r).

A compact subset of M is also totally bounded. Indeed, a subset E of M is not
totally bounded if and only if there is an \epsilon > 0 and a sequence of points
\{x_j\}_{j=1}^\infty in E such that d(x_j, x_k) \ge \epsilon for all positive
integers j, k with j \ne k, and such a sequence has no convergent subsequence.

More precisely, a subset E of a metric space (M, d(x, y)) is totally bounded
if and only if every sequence of points in E has a subsequence which is a Cauchy
sequence. This is not too difficult to show. As a result, a subset of a complete 
metric space is compact if and only if it is closed and totally bounded. In 
particular, closed and bounded subsets of Euclidean spaces are compact. 
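The two descriptions of total boundedness above are linked by a greedy
selection: keep choosing points of E that are at distance at least r from all
points chosen so far. If the process stops, the chosen points form a finite set
F as in (5.54); if it never stops, it produces a sequence whose terms are
pairwise at distance at least r. The Python sketch below runs this selection on
a finite sample, so it always stops. The grid in the unit square, the radius
r = 0.35, and the function name greedy_net are assumptions made for
illustration only.

```python
import math


def greedy_net(E, dist, r):
    """Greedily choose F inside E so that E is covered by the open balls B(x, r), x in F.

    Each chosen point is at distance >= r from all previously chosen points.
    """
    F = []
    for x in E:
        if all(dist(x, c) >= r for c in F):
            F.append(x)
    return F


def euclidean(u, v):
    """Standard Euclidean metric on the plane."""
    return math.hypot(u[0] - v[0], u[1] - v[1])


# Hypothetical sample: an 11 x 11 grid of points in the unit square.
E = [(i / 10, j / 10) for i in range(11) for j in range(11)]
r = 0.35
F = greedy_net(E, euclidean, r)

print(f"{len(F)} centers cover all {len(E)} sample points with balls of radius {r}")
```

Every point that is not chosen was rejected because it lies within distance
less than r of an earlier center, which is exactly the covering condition
(5.54).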

Let (M, d(x, y)) and (N, \rho(u, v)) be metric spaces, and let f be a mapping
from M to N. We say that f is bounded if the image of f is a bounded subset
of N. The space of bounded continuous mappings from M to N is denoted
C_b(M, N).

There is a natural metric on C_b(M, N), called the supremum metric, which
is defined by

(5.55) \theta(f_1, f_2) = \sup\{\rho(f_1(x), f_2(x)) : x \in M\}

for f_1, f_2 \in C_b(M, N). Convergence of sequences in C_b(M, N) is equivalent
to uniform convergence, as compared to pointwise convergence of mappings. A
basic result, which is not too difficult to show, states that if N is a complete
metric space, then so is C_b(M, N).
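As a concrete illustration of the supremum metric, consider the functions
f_n(x) = x^n on M = [0, 1/2], with N the real line and its standard metric.
This example, the sample grid, and the helper name sup_distance are assumptions
made here. A maximum over a finite sample is in general only a lower bound for
the supremum in (5.55), but in this example the supremum is attained at the
endpoint x = 1/2, which belongs to the sample.

```python
def sup_distance(f1, f2, rho, sample):
    """Maximum of rho(f1(x), f2(x)) over a finite sample of the domain.

    In general this is a lower bound for the supremum in (5.55).
    """
    return max(rho(f1(x), f2(x)) for x in sample)


def rho(u, v):
    """Standard metric on the real line."""
    return abs(u - v)


def zero(x):
    """The constant function 0."""
    return 0.0


def power(n):
    """The function x -> x^n."""
    return lambda x: x ** n


# Hypothetical sample of M = [0, 1/2], including both endpoints.
sample = [0.5 * i / 1000 for i in range(1001)]

distances = [sup_distance(power(n), zero, rho, sample) for n in range(1, 6)]

for n, value in zip(range(1, 6), distances):
    print(f"n = {n}: distance from x^{n} to 0 is {value}")
```

The distances are (1/2)^n, which tend to 0, so f_n converges to the zero
function uniformly on [0, 1/2], that is, in the supremum metric.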

Let us write Lip(M, N) for the space of Lipschitz mappings from M to N,
and for each positive real number k, let us write Lip_k(M, N) for the space of
k-Lipschitz mappings from M to N. If M is bounded, then Lip(M, N) is contained
in C_b(M, N), and for each k > 0, Lip_k(M, N) is a closed subset of C_b(M, N).
If M is bounded, p is an element of M, k is a positive real number, and B is a
bounded subset of N, then the set of f in Lip_k(M, N) such that f(p) \in B is a
bounded subset of C_b(M, N).

Suppose that (M, d(x, y)) and (N, \rho(u, v)) are metric spaces, with M
compact. If f is a continuous mapping from M to N, then the image of f is a
compact subset of N. In particular, every continuous mapping from M to N
is bounded in this case. If M, N are both compact, then one can show that
Lip_k(M, N) is a compact subset of C_b(M, N) for every positive real number k.